Recently, compressed sensing techniques in combination with both wavelet and directional 
representation systems have been very effectively applied to the problem of image inpainting. 
However, a mathematical analysis of these techniques which reveals the underlying geometrical 
content is completely missing. In this paper, we provide the first comprehensive analysis in the 
continuum domain utilizing the novel concept of clustered sparsity, which besides leading to 
asymptotic error bounds also makes the superior behavior of directional representation systems 
over wavelets precise. First, we propose an abstract model for problems of data recovery and 
derive error bounds for two different recovery schemes, namely $\ell_1$ minimization and thresholding. 
Second, we set up a particular microlocal model for an image governed by edges inspired by 
seismic data as well as a particular mask to model the missing data, namely a linear singularity 
masked by a horizontal strip. Applying the abstract estimate in the case of wavelets and of 
shearlets, we prove that, provided the size of the missing part is asymptotically comparable to the 
size of the analyzing functions, asymptotically precise inpainting can be obtained for this model. 
Finally, we show that shearlets can fill strictly larger gaps than wavelets in this model. 

1 Introduction 

A common problem in many fields of scientific research is that of missing data. The human visual 
system has an amazing ability to fill in the missing parts of images, but automating this process 
is not trivial. Also, depending on the type of data, the human senses may be unable to fill in the 
gaps. Conservators working to repair damaged paintings use the term in-painting to describe the 
process. This word now also means digitally recovering missing data in videos and images. The 
removal of overlaid text in images, the repair of scratched photos and audio recordings, and the 
recovery of missing blocks in a streamed video are all examples of inpainting. Seismic data are 
also commonly incomplete due to land development and bodies of water preventing optimal sensor 
placement [HFH10, HH08]. In the seismic processing flow, data recovery plays an important role. 

One very common approach to inpainting is using variational methods [BBC+01, BBS01, 
BSCB00, CS02]. However, recently the novel methodology of compressed sensing, namely exact 
recovery of sparse or sparsified data from highly incomplete linear non-adaptive measurements by 
$\ell_1$ minimization or thresholding, has been very effectively applied to this problem. The pioneering 
paper is [ESQD05], which uses curvelets as sparsifying system for inpainting. Various intriguing 
empirical results have since been obtained using applied harmonic analysis in combination with 
convex optimization [CCS10, DJL+12, ESQD05]. These three papers do contain theoretical analyses 
of the convergence of their algorithms to the minimizers of specific optimization problems but not 
theoretical analyses of how well those minimizers actually inpaint. Other theoretical analyses of 
those types of methods (imposing sparsity with a discrete dictionary) typically use a discrete model 
of the original image which does not allow the geometry of the problem to be taken into account. 
However, variational methods are built on continuous methods and may be analyzed using a 
continuous model, for example, [CKS02]. Also, some work has been done to compare variational 
approaches with those built on $\ell_1$ minimization [CDOS12, Mey01]. Finally, in works such as 
[HFH10] and [HH08], intuition is given for why directional representation systems such as curvelets 
and shearlets outperform wavelets when inpainting images strongly governed by curvilinear 
structures such as seismic images. So, although there are many theoretical results concerning 
inpainting, they mainly concern algorithmic convergence or variational methods. 

The preliminary results presented in the SPIE Proceedings paper [KKZ11] combined with the 
theory in this paper provide the first comprehensive analysis of inpainting with discrete dictionaries 
in the continuum domain utilizing the novel concept of clustered sparsity, which besides leading to 
asymptotic error bounds also makes the superior behavior of directional representation systems 
over wavelets precise. Along the way, our abstract model and analysis lay a common theoretical 
foundation for data recovery problems when utilizing either analysis-side $\ell_1$ minimization or 
thresholding as recovery schemes (Section 2). These theoretical results are then used as tools to 
analyze a specific inpainting model (Sections 3-6). 

1.1 A Continuum Model 

One of the first practitioners of curvelet inpainting for applications was the seismologist Felix 
Herrmann, who achieved superior recovery results for images which consisted of curvilinear 
singularities in which vertical strips are missing due to missing sensors. These techniques were soon 
also exploited for astronomical imaging, etc., the common thread being that the images are governed 
by curvilinear singularities. It is evident that no discrete model can appropriately capture such 
geometrical content. 

Thus a continuum domain model seems appropriate. In fact, in this paper we choose a 
distributional model, namely a distribution $w\mathcal{L}$ acting on Schwartz functions 
$g \in \mathcal{S}(\mathbb{R}^2)$ by 

\[ \langle w\mathcal{L}, g \rangle = \int_{-\rho}^{\rho} w(x_1)\, g(x_1, 0)\, dx_1, \]

the weight $w$ and length $2\rho$ being specified in the main body of the paper. Essentially, the weight $w$ 
sets up the linear singularity that is smooth in the vertical direction, while the value of $\rho$ corresponds 
to the length of the singularity. Mimicking the seismic imaging situation, we might then choose 
the shape of the missing part to be 

\[ M_h = \mathbf{1}_{\{|x_1| \le h\}}, \]

i.e., a vertical strip of width $2h$. Clearly, $h$ cannot be too large relative to $\rho$ or else we are erasing 
too much of $w\mathcal{L}$. Further, we let $P_{M_h}$ and $P_{\mathbb{R}^2 \setminus M_h}$ denote the orthogonal projections of 
$L^2(\mathbb{R}^2)$ onto the missing part and the known part, respectively. The task can now be formulated 
in a mathematically precise way: Given 

\[ f = P_{\mathbb{R}^2 \setminus M_h} w\mathcal{L}, \]

recover $w\mathcal{L}$. 

It should be mentioned that such a microlocal viewpoint was first introduced and studied in 
the situation of image separation [DK12]. 

1.2 Sparsifying Systems 

It was recently made precise that the optimal sparsifying systems for such images governed by 
anisotropic structures are curvelets [CD04] and shearlets [KL12, KL11]. Of these two systems, 
shearlets have the advantage that they provide a unified concept of the continuum and digital 
domain, which curvelets do not achieve. However, many inpainting algorithms still use wavelets, 
and one might ask whether shearlets provably outperform wavelets. In fact, we will make the 
superior behavior of shearlets within our model situation precise. 

For our analysis, we will use systems of wavelets and shearlets which are defined below. Both 
systems are smooth Parseval frames. Parseval frames generalize orthonormal bases in a manner 
which will be useful in the sequel. 

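Definition 1.1. A collection of vectors $\Phi = \{\varphi_i\}_{i \in I}$ in a separable Hilbert space $\mathcal{H}$ forms a 
Parseval frame for $\mathcal{H}$ if for all $x \in \mathcal{H}$, 

\[ \sum_{i \in I} |\langle x, \varphi_i \rangle|^2 = \|x\|^2. \]

With a slight abuse of notation, given a Parseval frame $\Phi$, we also use $\Phi$ to denote the synthesis 
operator 

\[ \Phi : \ell_2(I) \to \mathcal{H}, \qquad \Phi(\{c_i\}_{i \in I}) = \sum_{i \in I} c_i \varphi_i. \]

With this notation, $\Phi^*$ is called the analysis operator. 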
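
The Parseval identity and the roles of the analysis and synthesis operators can be checked numerically on a small example. The following is a minimal sketch, assuming a toy frame of three equal-norm vectors in $\mathbb{R}^2$ (the so-called Mercedes-Benz frame); this frame and the helper names `mercedes_benz_frame`, `analysis`, and `synthesis` are illustrative assumptions and do not appear in the paper.

```python
import math

def mercedes_benz_frame():
    """Three equal-norm vectors in R^2 forming a Parseval frame (toy example)."""
    scale = math.sqrt(2.0 / 3.0)
    angles = [math.pi / 2 + 2 * math.pi * k / 3 for k in range(3)]
    return [(scale * math.cos(t), scale * math.sin(t)) for t in angles]

def analysis(frame, x):
    """Analysis operator: x -> (<x, phi_i>)_i."""
    return [sum(xc * pc for xc, pc in zip(x, phi)) for phi in frame]

def synthesis(frame, coeffs):
    """Synthesis operator: (c_i)_i -> sum_i c_i phi_i."""
    dim = len(frame[0])
    return tuple(sum(c * phi[d] for c, phi in zip(coeffs, frame)) for d in range(dim))

frame = mercedes_benz_frame()
x = (0.3, -1.7)

coeffs = analysis(frame, x)
energy = sum(c * c for c in coeffs)        # sum_i |<x, phi_i>|^2
norm_sq = sum(xc * xc for xc in x)         # ||x||^2
reconstructed = synthesis(frame, coeffs)   # synthesis applied to the analysis coefficients

print(abs(energy - norm_sq) < 1e-12)
print(all(abs(r - xc) < 1e-12 for r, xc in zip(reconstructed, x)))
```

Both checks reflect the defining property: the coefficient energy equals $\|x\|^2$, and synthesis applied to the analysis coefficients returns $x$, even though the three vectors are linearly dependent and hence do not form a basis.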



1.2.1 Wavelets 

Meyer wavelets are some of the earliest known examples of orthonormal wavelets; they also happen 
to have high regularity [Dau92, Mey87]. We modify the classic system to get a decomposition of the 
Fourier domain that is comparable to the shearlet system that we will use. For the construction, 
let $\nu \in C^\infty(\mathbb{R})$ satisfy $\nu(\cdot) + \nu(1 - \cdot) = \mathbf{1}_{\mathbb{R}}(\cdot)$, where the indicator function $\mathbf{1}_A$ is defined to take 
the value 1 on $A$ and 0 on $A^c$, and 

\[ \nu(x) = \begin{cases} 0 & : x \le 0, \\ 1 & : x \ge 1. \end{cases} \]
Then the Fourier transform of the 1D Meyer generator is defined by 

r e" 1 **/ 8 sin [f i/(16|fl-l)] : £ < |£| < f, 
W(0 = l e - 8 ^/ 3 cos[f^(8|e|-l)] : !<i£l<i 
^ : else, 

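and the Fourier transform of the scaled 1D Meyer scaling function is 

\[ \hat{\phi}(\xi) = \begin{cases} 1 & : |\xi| \le \frac{1}{16}, \\ \cos\left[\frac{\pi}{2}\nu(16|\xi| - 1)\right] & : \frac{1}{16} < |\xi| \le \frac{1}{8}, \\ 0 & : \text{else}, \end{cases} \]

where we use the following Fourier transform definition for $f \in L^1(\mathbb{R}^n)$: 

\[ \mathcal{F} f := \hat{f} = \int_{\mathbb{R}^n} f(x)\, e^{-2\pi i \langle x, \cdot \rangle}\, dx \]

(where $\langle \cdot, \cdot \rangle$ is the standard Euclidean inner product), which can be naturally extended to functions 
in $L^2(\mathbb{R}^n)$. The inverse Fourier transform is given by 

\[ \mathcal{F}^{-1} f := \check{f} = \int_{\mathbb{R}^n} f(\xi)\, e^{2\pi i \langle \xi, \cdot \rangle}\, d\xi. \]

We will not detail the interpretation of a scaling function but refer the interested reader to [Dau92, 
Mey87]. Then we define the $C^\infty \cap L^2(\mathbb{R}^2)$-functions $W^v$, $W^h$, and $W^d$ by 

\[ W^v(\xi) = \hat{\phi}(\xi_1) W(\xi_2), \quad W^h(\xi) = W(\xi_1) \hat{\phi}(\xi_2), \quad \text{and} \quad W^d(\xi) = W(\xi_1) W(\xi_2). \]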

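We denote 

\[ \hat{\psi}_\lambda(\xi) = 2^{-j}\, W^\iota(\xi / 2^j)\, e^{-2\pi i \langle k, \xi / 2^j \rangle}, \qquad \lambda = (\iota, j, k). \]

Then the Parseval Meyer wavelet system is given by 

\[ \{\psi_\lambda : \lambda = (\iota, j, k),\ \iota \in \{h, v, d\},\ j \in \mathbb{Z},\ k \in \mathbb{Z}^2\}. \]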

We have not yet shown that this system forms a Parseval frame. It is known (in various forms, 
for example |Chr03l ICS931 IDau92l lJm99l IKin09j ) that if for {ij/- G L 2 (R rf )} t 

EEEi^( 2 W( 2 ^- A; )i =0a - e - £ 

and 

^|^(2^)| 2 = la.e. & 
iez 

then 

{2 j ^°(2 j • -fc) : j G Z,k G Z d ,i} 
is a Parseval frame for $L^2(\mathbb{R}^d)$. The Meyer wavelet system defined above easily satisfies this. 

Figure 1: Frequency tiling of Meyer wavelets. 



1.2.2 Shearlets 



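We will use the construction of Guo and Labate of smooth Parseval frames of shearlets [GL12], 
which is a modification of cone-adapted shearlets (see, for example, [KL12]). Let the parabolic 
scaling matrices $A_a^h$ and $A_a^v$ and shearing matrices $S_s^h$ and $S_s^v$ be defined as 

\[ A_a^h = \begin{pmatrix} a^2 & 0 \\ 0 & a \end{pmatrix}, \quad S_s^h = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}, \quad A_a^v = \begin{pmatrix} a & 0 \\ 0 & a^2 \end{pmatrix}, \quad \text{and} \quad S_s^v = \begin{pmatrix} 1 & 0 \\ s & 1 \end{pmatrix}. \]

We use these dilation matrices as these are used in [GL12], and given particulars of their construction, 
it is not straightforward to adapt their methods to the dilation matrix 
$\begin{pmatrix} a & 0 \\ 0 & \sqrt{a} \end{pmatrix}$. In addition, 
given the fact that the matrices defined above always have integer values when $a$ is an integer, 
they are reasonable from the point of view of implementation. Let $V \in L^2(\mathbb{R}) \cap C^\infty(\mathbb{R})$ satisfy 
$\operatorname{supp} V \subset [-1, 1]$, and 

\[ \sum_{k=-1}^{1} |V(\xi + k)|^2 = 1, \qquad \xi \in [-1, 1]. \]

Further set $V^h(\xi) = V(\xi_2/\xi_1)$ and $V^v(\xi) = V(\xi_1/\xi_2)$. For $\xi = (\xi_1, \xi_2) \in \mathbb{R}^2$, define 

\[ \hat{\Phi}(\xi) = \hat{\phi}(\xi_1)\hat{\phi}(\xi_2) \]

and 

\[ W(\xi) = \sqrt{|\hat{\Phi}(2^{-2}\xi)|^2 - |\hat{\Phi}(\xi)|^2}. \]

We define the following shearlet system for $L^2(\mathbb{R}^2)$: 

\[ \{\phi_k : k \in \mathbb{Z}^2\} \cup \{\sigma^\iota_{j,\ell,k} : j \ge 0,\ |\ell| < 2^j,\ k \in \mathbb{Z}^2,\ \iota \in \{h, v\}\} \cup \{\sigma_{j,\ell,k} : j \ge 0,\ \ell = \pm 2^j,\ k \in \mathbb{Z}^2\}, \qquad (1) \]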
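where 

\[ \phi_k = \Phi(\cdot - k); \]

for $\iota \in \{h, v\}$, 

\[ \hat{\sigma}^\iota_{j,\ell,k}(\xi) = 2^{-3j/2}\, W(2^{-2j}\xi)\, V^\iota(\xi A^\iota_{2^{-j}} S^\iota_{-\ell})\, e^{2\pi i \langle \xi A^\iota_{2^{-j}} S^\iota_{-\ell},\, k \rangle}; \]

for $j = 0$ and $\ell = \pm 1$, 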



3 2vri«,fc> 



|6| 



< 1 



lfl> 1 



and for j>l,£ = ±Z>,a jAk (£) 



2-2i-2W(2- 2 ^)^(2^'| 
2-§J-lw(2- 2 ^)T/(2J| 



II <1 
H>1 



The $\sigma_{j,\ell,k}$ are the "seam" elements that piece together the $\sigma^h_{j,\ell,k}$ and $\sigma^v_{j,\ell,k}$. We now have the 
following result from [GL12, Theorem 5]. 

Theorem 1.2. The system defined in (1) is a Parseval frame for $L^2(\mathbb{R}^2)$. Furthermore, the 
elements of this system are $C^\infty$ and band-limited. 

We will sometimes employ the notation 

where $\iota \in \{h, v, \emptyset\}$, $j \in \mathbb{Z}$, $k \in \mathbb{Z}^2$, and $\ell \in \mathbb{Z}$. 

Fix a $j \ge 0$. Then the support of each $\hat{\sigma}^\iota_{j,\ell,k}$ and $\hat{\sigma}_{j,\ell,k}$ lies in the Cartesian corona 

\[ C_j = [-2^{2j-1}, 2^{2j-1}]^2 \setminus [-2^{2j-4}, 2^{2j-4}]^2. \qquad (2) \]

The position of the support inside the corona is determined by the values of $\ell$ and $\iota$, with the "seam" 
elements $\hat{\sigma}_{j,\ell,k}$ having support in the corners. Thus, the shearlet system induces the frequency tiling 
in Figure 2 (cf. Figure 1 for the frequency tiling of Meyer wavelets). 




Figure 2: Frequency tiling of the shearlet system. 



1.3 Recovery Algorithms 

We next decide upon a recovery strategy. Compressed sensing offers a variety of such, the most 
common ones being $\ell_1$ minimization and thresholding. We will also use these. However, in 
preparation for an asymptotic scale-dependent analysis (the fact that our model has energy at 
arbitrarily high frequencies requires this approach), we first perform a band-pass filtering on 
$w\mathcal{L}$. The band-pass filter will be, roughly speaking, chosen according to the band given by the 
wavelets and shearlets (see Figures 1 and 2), leading to the sequence 

\[ (f_j)_j = (P_{\mathbb{R}^2 \setminus M_h} w\mathcal{L}_j)_j. \]

The $\ell_1$ minimization problem we choose has the form 

\[ L_j = \operatorname{argmin}_L \|\Phi^* L\|_1 \quad \text{subject to} \quad f_j = P_{\mathbb{R}^2 \setminus M_h} L, \qquad (3) \]

where $\Phi$ is a Parseval frame. We emphasize that this approach to inpainting minimizes the analysis 
coefficients and is hence related to the newly introduced cosparsity model [NDEG11, NDEG12]. 
The choice will be explained further in Subsection 2.2. 

The thresholding strategy we choose is brutally simple. We only perform one step of hard 
thresholding: setting $T_j = \{i : |\langle f_j, \varphi_i \rangle| \ge \beta_j\}$ for some threshold $\beta_j$, the reconstructed 
image is 

\[ L_j = \Phi \mathbf{1}_{T_j} \Phi^* f_j. \qquad (4) \]

For the asymptotic analysis, the $\beta_j$ are explicitly computed in the proofs of Lemmas 4.4 and 5.5. 
In practice, as is usual with parameters in algorithms, one must be careful when selecting the $\beta_j$. 

It may be surprising that the geometry of wavelets and shearlets is strong enough for thresholding 
to achieve the same asymptotic recovery results as $\ell_1$ minimization for the respective systems. 
However, thresholding techniques can be viewed as approximations of $\ell_1$ minimization, and many 
parallel results have been found for $\ell_1$ minimization and thresholding. For example, $\ell_1$ minimization 
[DK12] and thresholding [Kut12] applied to the geometric separation problem both achieve 
asymptotic separation. In fact, thresholding can be used to separate wavefront sets [Kut12]. Iterative 
thresholding algorithms have successfully approximated solutions to such diverse sparsity problems 
as multidimensional NMR spectroscopy [Dro07] and finding row-sparse solutions to underdetermined 
linear systems [Fou11]. 

1.4 Microlocal Analysis 




Figure 3: Left: Wavefront set of a curvilinear singularity $\mathcal{C}$. Right: Wavefront set of a masked 
linear singularity $M_h w\mathcal{L}$. 



One might ask where the geometry we mentioned before will come into play. This can best be 
explained and illustrated using microlocal analysis in phase space. For a more detailed explanation 
of the fundamentals of microlocal analysis, see [Hör03], and for an application of microlocal analysis 
to derive a fundamental understanding of sparsity-based algorithms using shearlets and curvelets, 
see [CD05, Gro11, KL09]. Phase space in this context is indexed by position-orientation pairs $(b, \theta)$ 
which describe the singular behavior of a distribution. The orientation component $\theta$ is an element 
of real projective space, which for simplicity's sake we shall identify in what follows with $[0, \pi)$. The 
wavefront set $WF(f)$ of a distribution $f$ is roughly the set of elements in the phase space at which 
$f$ is nonsmooth. First consider a curvilinear singularity $\mathcal{C}$ along a closed curve $\tau : [0, 1] \to \mathbb{R}^2$: 

\[ \mathcal{C} = \int_0^1 \delta_{\tau(t)}(\cdot)\, dt, \]

where $\delta_x$ is the usual Dirac delta distribution located at $x$. As illustrated in Figure 3, the wavefront 
set of $\mathcal{C}$ is 

\[ WF(\mathcal{C}) = \{(\tau(t), \theta(t)) : t \in [0, 1]\}, \]

where $\theta(t)$ is the normal direction of $\mathcal{C}$ at $\tau(t)$. Now consider the model from Section 1.1, 

\[ f = P_{\mathbb{R}^2 \setminus M_h} w\mathcal{L}. \]

As can be seen in Figure 3, the wavefront set of $f$ almost looks like $f$ itself except that the 
wavefront set fills all possible angles (i.e., forms a spike) at the end points of the missing mask. 
This is because at the end points, the distribution is singular in all but the parallel direction. Note 
that the wavefront set of the linear singularity does not have spikes at the end due to the smooth 
weight. The difference between the approximate phase space portrait of shearlets and wavelets is 
demonstrated in Figure 4. The intuition behind the image comes from the fact that shearlets resolve 
the wavefront set [Gro11, KL09]. Our shearlets and wavelets are smooth and thus do not have a 
wavefront set. Nevertheless, by doing a continuous shearlet transform 
($f \mapsto \langle f, a^{3/2} \sigma(S_s A_a \cdot - k) \rangle$), one can get an approximation of phase space information which 
takes orientation into account; this is shown in Figure 4. This is similar in spirit to a wavelet 
spectrogram. 

Furthermore, in Figure 5 (Left) the small overlap of the wavefront set of a cluster of shearlets 
with a spike in the phase space, which represents an end point of the mask of missing information 
$M_h$, can be clearly seen. Thus shearlet clusters are incoherent with the end points, meaning that the 
clusters do not overlap the spikes strongly in the phase space. However, there is a lot of phase space 
overlap with the wavefront set away from the endpoints. It is thus easy to see how a cluster 
of shearlets can span a gap of missing data (Figure 5 (Right)). Herrmann and Hennenfent call this 
property the "principle of alignment," which explains why curvelets "attain high compression on 
synthetic data as well as on real seismic data" [HH08]. The phase space information of curvelets 
and that of shearlets are essentially the same [GK12]. 



1.5 Asymptotical Analysis 

The width of the area to be inpainted plays a key role, even when using other inpainting techniques. 
In [CK06], variational inpainting methods are analyzed theoretically, showing that the local 
thickness of the area to be inpainted affects the success of the inpainting more than the overall size of 
the area to be inpainted. 

Figure 4: Left: Effective supports of wavelets (disks) and shearlets (ellipses). Right: Phase space 
portrait of the same wavelets (spikes) and shearlets (ellipses). 

Figure 5: Left: Phase space portrait of a cluster of shearlets and one single wavelet. Right: Phase 
space portrait of shearlets tiling a gap. 

Thus our analysis shall also take this into account. We accomplish this by also making the gap 
size $h$ dependent on the scale $j$. This leads to the problem of recovering $w\mathcal{L}_j$ from knowledge of 

\[ f_j = P_{\mathbb{R}^2 \setminus M_{h_j}} w\mathcal{L}_j \]

for each scale $j$. Letting $L_j$ denote the image recovered by either one of the proposed algorithms, 
we will show that asymptotically precise inpainting, i.e., 

\[ \frac{\|L_j - w\mathcal{L}_j\|_2}{\|w\mathcal{L}_j\|_2} \to 0, \qquad j \to \infty, \]

is achieved for wavelets provided that $h_j = o(2^{-2j})$ (Theorems 4.3 and 4.7) as $j \to \infty$ and for 
shearlets provided that $h_j = o(2^{-j})$ (Theorems 5.4 and 5.8) as $j \to \infty$. In fact, this is exactly what 
one would imagine. Inpainting succeeds provided that the gap size is comparable to the size of the 
analyzing elements. The scale-dependent gap size allows us to analyze dependency on the size of 
the shearlets and wavelets in a clear way, providing a theoretical understanding of how inpainting 
algorithms work even though in practice the gap size is fixed. 



1.6 Wavelets versus Shearlets 

This observation seems to indicate that shearlets indeed perform better than wavelets. However, 
the previously mentioned theorems just state positive results. In order to show that shearlets 
outperform wavelets in the model situation which we consider, we require a negative result of the 
following type: If hj > 0{2~ 3 ) as j — > oo and Lj is recovered by wavelets, then 



\[ \frac{\|L_j - w\mathcal{L}_j\|_2}{\|w\mathcal{L}_j\|_2} \not\to 0, \qquad j \to \infty. \]

And in fact, this is what we will prove in Theorem 6.2. In this sense, we now have a mathematically 
precise statement showing that shearlets are strictly better for inpainting in our model. 

The only slight disappointment is that this statement will only be proven for thresholding as 
the recovery scheme. We strongly suspect that this result also holds for $\ell_1$ minimization. 
However, we are not aware of any analysis tools strong enough to derive such a result in that 
situation. 

1.7 Our Approach 

Our analysis has focused primarily on revealing the fundamental mathematical concepts which 
lead to successful image inpainting using wavelets or shearlets. The viewpoint we take, however, 
is that this is just the "tip of the iceberg," and the main results are amenable to very extensive 
generalizations and extensions. For example, our asymptotic analysis is based on a vertical mask 
of missing data from a horizontal wavefront. Other masks applied to curved wavefronts could 
be considered. The microlocal bending techniques employed in [DK12] seem to suggest that this 
approach will yield desirable results. 

1.8 Contents 

We begin in Section 2 with an abstract analysis of data recovery via $\ell_1$ minimization, introducing 
clustered sparsity and concentration in a Hilbert space as tools. We then apply the results of 
Section 2 to a particular class of inpainting problems, which are described in Section 3. In Sections 4 
and 5, we prove that both wavelets and shearlets, respectively, are able to inpaint a missing band but 
that shearlets can handle wider gaps. It is shown in Section 6 that the inpainting result for wavelets 
in Section 4 is tight; i.e., shearlets strictly outperform wavelets in the considered model situation. 
We discuss future directions of research and limitations of the current model in Section 7. Finally, 
Section 8 is an appendix that contains auxiliary results concerning shearlets needed for Section 5. 

2 Abstract Analysis of Data Recovery 

We start by analyzing missing data recovery via $\ell_1$ minimization and thresholding in an abstract 
model situation. The error estimates we will derive can be applied in a variety of situations. In 
this paper, as discussed before, we aim to utilize them to analyze inpainting via wavelets and 
shearlets following a continuum domain model. In fact, these error estimates will later on be applied 
to each scale while deriving an asymptotic analysis. 

2.1 Abstract Model 

Let $x^0 \in \mathcal{H}$ be a signal in a Hilbert space $\mathcal{H}$. To model the data recovery problem correctly, we 
assume that $\mathcal{H}$ can be decomposed into a direct sum 

\[ \mathcal{H} = \mathcal{H}_M \oplus \mathcal{H}_K \]

of a subspace $\mathcal{H}_M$ which is associated with the missing part of $x^0$ and a subspace $\mathcal{H}_K$ which relates 
to the known part of the signal. Further, let $P_M$ and $P_K$ denote the orthogonal projections onto 
those subspaces, respectively. The problem of data recovery can then be formulated as follows: 
Assuming that $P_K x^0$ is known to us, recover $x^0$. 

Following the philosophy of compressed sensing, suppose that there exists a Parseval frame 
$\Phi$ which, in a way yet to be made precise, sparsifies the original signal $x^0$. Either $\Phi$ can be 
selected non-adaptively, such as choosing a wavelet or shearlet system, which will be our avenue in 
the sequel, or $\Phi$ can be chosen adaptively using dictionary learning algorithms such as [AEB06, 
EAHH99, OF97]. 

To already draw the connection to the special situation of inpainting at this point, assume 
that $\mathcal{H} = L^2(\mathbb{R}^2)$. If the measurable subset $B \subset \mathbb{R}^2$ is the missing area of the image, we set 
$\mathcal{H}_K = L^2(\mathbb{R}^2 \setminus B)$ and $\mathcal{H}_M = L^2(B)$. 



2.2 Inpainting via $\ell_1$ Minimization 

A first methodology from compressed sensing to achieve recovery is $\ell_1$ minimization, which recovers 
the original signal by solving 

\[ \text{(Inp)} \qquad x^\star = \operatorname{argmin}_x \|\Phi^* x\|_1 \quad \text{subject to} \quad P_K x = P_K x^0. \]

We wish to remark that in this problem, the norm is placed on the analysis coefficients rather 
than on the synthesis coefficients as in [DE03, EB02] and other papers on basis pursuit. Since we 
intend to also apply this optimization problem in the situation when $\Phi$ does not form a basis but 
merely a frame, the analysis and synthesis approaches are different. One reason to do this is to 
avoid numerical instabilities which are expected to occur since, for each $x \in \mathcal{H}$, the linear system 
of equations $x = \Phi c$ has infinitely many solutions; here only the specific solution $\Phi^* x$ is analyzed. 
Also, since we are only interested in correctly inpainting and not in computing the sparsest expansion, 
we can circumvent possible problems by solving the inpainting problem by selecting a particular 
coefficient sequence which expands out to $x$, namely the analysis sequence. A similar strategy 
was pursued in [KKZ11] and [Kut12]. Various inpainting algorithms which are based on the core 
idea of (Inp) combined with geometric separation are heuristically shown to be successful in [CCS10, 
DJL+12, ESQD05]. 

Interestingly, this minimization problem can also be regarded as a mixed $\ell_1$-$\ell_2$ problem [KT09], 
since the analysis coefficient sequence $\Phi^* x$ is exactly the minimizer of 

\[ \min\{\|c\|_2 : c \in \ell_2,\ x = \Phi c\}, \]

that is, the coefficient sequence which is minimal in the $\ell_2$ norm. The optimization problem in 
(Inp) may also be thought of as a relaxation of the cosparsity problem 

\[ x^\star = \operatorname{argmin}_x \|\Phi^* x\|_0 \quad \text{subject to} \quad P_K x = P_K x^0. \]

Theoretical results concerning cosparsity may be found in [NDEG11, NDEG12]. 

We also consider the noisy case. Assume now that we know $\tilde{x} = P_K x^0 + n$, where $x^0$ and $n$ 
are unknown, but $n$ is assumed to be small in the sense of $\|\Phi^* n\|_1 \le \varepsilon$ for small $\varepsilon$. Also, clearly 
$n = P_K n$. Then we solve 

\[ \text{(InpNoise)} \qquad x^\star = \operatorname{argmin}_x \|\Phi^* x\|_1 \quad \text{subject to} \quad P_K x = \tilde{x}. \]

To analyze this optimization problem, we require the following notion, which intuitively 
measures the maximal fraction of the total $\ell_1$ norm which can be concentrated on the index set $\Lambda$ 
restricted to functions in $\mathcal{H}_M$. In this sense, the geometric relation between the missing part $\mathcal{H}_M$ 
and expansions in $\Phi$ is encoded. 

Definition 2.1. Let $\Phi$ be a Parseval frame, and let $\Lambda$ be an index set of coefficients. We then define 
the concentration on $\mathcal{H}_M$ by 

\[ \kappa = \kappa(\Lambda, \mathcal{H}_M) = \sup_{f \in \mathcal{H}_M} \frac{\|\mathbf{1}_\Lambda \Phi^* f\|_1}{\|\Phi^* f\|_1}. \]

This notion allows us to formulate our first estimate concerning the $\ell_2$ error of the reconstruction 
via (Inp). The reader should notice that the considered error $\|x^\star - x^0\|_2$ is solely measured on $\mathcal{H}_M$, 
the masked space, since $P_K x^\star = P_K x^0$ due to the constraint in (Inp). Another important notion 
is that of clustered sparsity. 

Definition 2.2. Fix $\delta > 0$. Given a Hilbert space $\mathcal{H}$ with a Parseval frame $\Phi$, $x \in \mathcal{H}$ is $\delta$-clustered 
sparse in $\Phi$ (with respect to $\Lambda$) if 

\[ \|\mathbf{1}_{\Lambda^c} \Phi^* x\|_1 \le \delta, \]

where given a space $X$ and a subset $A \subseteq X$, $A^c$ denotes $X \setminus A$. 

We now present a pair of lemmas which were first published in [KKZ11] without proof. 

Lemma 2.3. Fix $\delta > 0$ and suppose that $x^0$ is $\delta$-clustered sparse in $\Phi$. Let $x^\star$ solve (Inp). Then 

\[ \|x^\star - x^0\|_2 \le \frac{2\delta}{1 - 2\kappa}. \]

The noiseless case Lemma 2.3 holds as a corollary to the case with noise, which follows. 

Lemma 2.4. Fix $\delta > 0$ and suppose that $x^0$ is $\delta$-clustered sparse in $\Phi$. Let $x^\star$ solve (InpNoise). 
Also assume that the noise satisfies $\|\Phi^* n\|_1 \le \varepsilon$. Then 

\[ \|x^\star - x^0\|_2 \le \frac{2\delta + (3 + 2\kappa)\varepsilon}{1 - 2\kappa}. \]

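Proof. Since $\Phi$ is Parseval, 

\[ \|x^\star - x^0\|_2 \le \|\Phi^*(x^\star - x^0)\|_1. \qquad (5) \]

We invoke the relation $P_K x^\star = P_K x^0 + n$, which implies that $P_K(x^\star - x^0) = n$. Using the definition 
of $\kappa$, we obtain 

\[ \begin{aligned} \|\mathbf{1}_\Lambda \Phi^*(x^\star - x^0)\|_1 &\le \|\mathbf{1}_\Lambda \Phi^* P_M(x^\star - x^0)\|_1 + \|\mathbf{1}_\Lambda \Phi^* n\|_1 \le \kappa \|\Phi^* P_M(x^\star - x^0)\|_1 + \|\Phi^* n\|_1 \\ &\le \kappa \|\Phi^*(x^\star - x^0)\|_1 + (1 + \kappa)\|\Phi^* n\|_1 \le \kappa \|\Phi^*(x^\star - x^0)\|_1 + (1 + \kappa)\varepsilon. \end{aligned} \qquad (6) \]

It follows that 

\[ \|\Phi^*(x^\star - x^0)\|_1 = \|\mathbf{1}_\Lambda \Phi^*(x^\star - x^0)\|_1 + \|\mathbf{1}_{\Lambda^c} \Phi^*(x^\star - x^0)\|_1 \le \kappa \|\Phi^*(x^\star - x^0)\|_1 + \|\mathbf{1}_{\Lambda^c} \Phi^*(x^\star - x^0)\|_1 + (1 + \kappa)\varepsilon. \]

The clustered sparsity of $x^0$ now implies 

\[ \|\Phi^*(x^\star - x^0)\|_1 \le \frac{1}{1 - \kappa}\left(\|\mathbf{1}_{\Lambda^c} \Phi^*(x^\star - x^0)\|_1 + (1 + \kappa)\varepsilon\right) \le \frac{1}{1 - \kappa}\left(\|\mathbf{1}_{\Lambda^c} \Phi^* x^\star\|_1 + \delta + (1 + \kappa)\varepsilon\right). \]

Applying the sparsity of $x^0$ again and the minimality of $x^\star$, we have 

\[ \begin{aligned} \|\mathbf{1}_{\Lambda^c} \Phi^* x^\star\|_1 &= \|\Phi^* x^\star\|_1 - \|\mathbf{1}_\Lambda \Phi^* x^\star\|_1 \le \|\Phi^*(x^0 + n)\|_1 - \|\mathbf{1}_\Lambda \Phi^* x^\star\|_1 \\ &\le \|\Phi^* x^0\|_1 - \|\mathbf{1}_\Lambda \Phi^* x^\star\|_1 + \varepsilon \le \|\Phi^* x^0\|_1 + \|\mathbf{1}_\Lambda \Phi^*(x^\star - x^0)\|_1 - \|\mathbf{1}_\Lambda \Phi^* x^0\|_1 + \varepsilon \\ &\le \|\mathbf{1}_\Lambda \Phi^*(x^\star - x^0)\|_1 + \delta + \varepsilon. \end{aligned} \]

Using (6) and the two preceding estimates, this leads to 

\[ \begin{aligned} \|\Phi^*(x^\star - x^0)\|_1 &\le \frac{1}{1 - \kappa}\left(\|\mathbf{1}_{\Lambda^c} \Phi^* x^\star\|_1 + \delta + (1 + \kappa)\varepsilon\right) \\ &\le \frac{1}{1 - \kappa}\left(\|\mathbf{1}_\Lambda \Phi^*(x^\star - x^0)\|_1 + 2\delta\right) + \frac{(2 + \kappa)\varepsilon}{1 - \kappa} \\ &\le \frac{1}{1 - \kappa}\left(\kappa \|\Phi^*(x^\star - x^0)\|_1 + 2\delta\right) + \frac{(3 + 2\kappa)\varepsilon}{1 - \kappa}. \end{aligned} \]

Combining this with (5), we finally obtain 

\[ \|x^\star - x^0\|_2 \le \left[1 - \frac{\kappa}{1 - \kappa}\right]^{-1} \frac{2\delta + (3 + 2\kappa)\varepsilon}{1 - \kappa} = \frac{2\delta + (3 + 2\kappa)\varepsilon}{1 - 2\kappa}. \]

□ 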



We now establish a relation between the concentration $\kappa(\Lambda, \mathcal{H}_M)$ on $\mathcal{H}_M$ and the notion of 
cluster coherence $\mu_c$ first introduced in [DK12]. For this, by abusing notation, we will write 
$P_M \Phi = \{P_M \varphi_i\}_i$ and $P_K \Phi = \{P_K \varphi_i\}_i$ for the projected frame elements. 

To first introduce the notion of cluster coherence, recall that in many studies of $\ell_1$ optimization, 
one utilizes the mutual coherence 

\[ \mu(\Phi_1, \Phi_2) = \max_j \max_i |\langle \varphi_{1i}, \varphi_{2j} \rangle|, \]

whose importance was shown by [DH01]. This may be called the singleton coherence. We modify 
the definition to take into account clustering of the coefficients arising from the geometry of the 
situation. 

Definition 2.5. Let $\Phi_1 = \{\varphi_{1i}\}_{i \in I}$ and $\Phi_2 = \{\varphi_{2j}\}_{j \in J}$ lie in a Hilbert space $\mathcal{H}$ and let $\Lambda \subseteq I$. 
Then the cluster coherence $\mu_c(\Lambda, \Phi_1; \Phi_2)$ of $\Phi_1$ and $\Phi_2$ with respect to $\Lambda$ is defined by 

\[ \mu_c(\Lambda, \Phi_1; \Phi_2) = \max_{j \in J} \sum_{i \in \Lambda} |\langle \varphi_{1i}, \varphi_{2j} \rangle|. \]
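
For finite systems, both the mutual (singleton) coherence and the cluster coherence, i.e. the maximum over $j$ of the sum over $i \in \Lambda$ of $|\langle \varphi_{1i}, \varphi_{2j} \rangle|$, can be computed directly. The following is a minimal sketch on two hypothetical toy systems in $\mathbb{R}^2$ (the standard basis and its rotation by 45 degrees); the function names and the example data are assumptions made for illustration only.

```python
import math

def inner(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def mutual_coherence(frame1, frame2):
    """Singleton coherence: max_j max_i |<phi_1i, phi_2j>|."""
    return max(abs(inner(p, q)) for p in frame1 for q in frame2)

def cluster_coherence(cluster, frame1, frame2):
    """Cluster coherence: max_j sum_{i in cluster} |<phi_1i, phi_2j>|."""
    return max(sum(abs(inner(frame1[i], q)) for i in cluster) for q in frame2)

# Hypothetical toy systems: the standard basis of R^2 and its rotation by 45 degrees.
s = 1.0 / math.sqrt(2.0)
standard = [(1.0, 0.0), (0.0, 1.0)]
rotated = [(s, s), (-s, s)]

mu = mutual_coherence(standard, rotated)
mu_single = cluster_coherence([0], standard, rotated)   # cluster with one index
mu_pair = cluster_coherence([0, 1], standard, rotated)  # cluster with both indices
print(mu, mu_single, mu_pair)
```

For a cluster consisting of a single index whose vector attains the maximal inner product, the cluster coherence coincides with the mutual coherence; enlarging the cluster can only increase it, and it never exceeds the cluster size times the mutual coherence.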

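The following relation is a specific case of Proposition 3.1 in [KKZ11]. We include a proof for 
completeness. 

Lemma 2.6. We have 

\[ \kappa(\Lambda, \mathcal{H}_M) \le \mu_c(\Lambda, P_M \Phi; P_M \Phi) = \mu_c(\Lambda, P_M \Phi; \Phi). \]

Proof. For each $f \in \mathcal{H}_M$, we choose a coefficient sequence $\alpha$ such that $f = \Phi \alpha$ and $\|\alpha\|_1 \le \|\beta\|_1$ 
for all $\beta$ satisfying $f = \Phi \beta$. Invoking the fact that $\Phi$ is a tight frame, hence $f = \Phi \Phi^* \Phi \alpha$, and the 
fact that $f = (P_M \Phi)\alpha$, we obtain 

\[ \begin{aligned} \|\mathbf{1}_\Lambda \Phi^* f\|_1 &= \|\mathbf{1}_\Lambda (P_M \Phi)^* f\|_1 = \|\mathbf{1}_\Lambda (P_M \Phi)^* (P_M \Phi) \alpha\|_1 \\ &\le \sum_{i \in \Lambda} \sum_j |\langle P_M \varphi_i, P_M \varphi_j \rangle|\, |\alpha_j| = \sum_j \left( \sum_{i \in \Lambda} |\langle P_M \varphi_i, P_M \varphi_j \rangle| \right) |\alpha_j| \\ &\le \mu_c(\Lambda, P_M \Phi; P_M \Phi)\, \|\alpha\|_1 \le \mu_c(\Lambda, P_M \Phi; P_M \Phi)\, \|\Phi^* \Phi \alpha\|_1 \\ &= \mu_c(\Lambda, P_M \Phi; P_M \Phi)\, \|\Phi^* f\|_1. \end{aligned} \]

□ 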



Combining Lemmata 2.3 and 2.6 proves the final noiseless estimate, and combining Lemmata 
2.4 and 2.6 proves the final estimate with noise: 

Proposition 2.7. Fix $\delta > 0$ and suppose that $x^0$ is $\delta$-clustered sparse in $\Phi$. Let $x^\star$ solve (Inp). 
Then 

\[ \|x^\star - x^0\|_2 \le \frac{2\delta}{1 - 2\mu_c(\Lambda, P_M \Phi; \Phi)}. \]

Proposition 2.8. Fix $\delta > 0$ and suppose that $x^0$ is $\delta$-clustered sparse in $\Phi$. Let $x^\star$ solve (InpNoise). 
Also assume that the noise satisfies $\|\Phi^* n\|_1 \le \varepsilon$. Then 

\[ \|x^\star - x^0\|_2 \le \frac{2\delta + (3 + 2\kappa)\varepsilon}{1 - 2\mu_c(\Lambda, P_M \Phi; \Phi)}. \]

Let us briefly interpret this estimate, first focusing on the noiseless case. As expected, the error 
decreases linearly with the clustered sparsity. It should also be emphasized that both clustered 
sparsity and cluster coherence depend on the chosen "geometric set of indices" $\Lambda$. Thus this set 
is crucial for determining whether $\Phi$ is a good dictionary for inpainting. This will be illustrated 
in the sequel when considering a particular situation; however, $\Lambda$ is merely an analysis tool and 
explicit knowledge of it is not necessary to recover data. Note that in general, the larger the set 
$\Lambda$ is, the smaller $\|\mathbf{1}_{\Lambda^c} \Phi^* x^0\|_1$ is (i.e., $x^0$ is $\delta$-relatively sparse for a smaller $\delta$) and the larger the 
cluster coherence is. These two effects seem to conflict, but if $\Phi$ sparsifies $x^0$ well, then a small set $\Lambda$ 
can be chosen which keeps $\|\mathbf{1}_{\Lambda^c} \Phi^* x^0\|_1$ small. Finally, considering the noisy case, as also expected 
the error estimate depends linearly on the bound $\varepsilon$ for the noise. 
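
The trade-off between clustered sparsity and cluster coherence can be made concrete by evaluating the right-hand side of the noiseless estimate, $2\delta / (1 - 2\mu_c)$, for a few values. The following is a minimal sketch; the function name `noiseless_error_bound` and the sample values of $\delta$ and $\mu_c$ are hypothetical and serve only to illustrate the shape of the bound.

```python
def noiseless_error_bound(delta, mu_c):
    """Right-hand side 2*delta / (1 - 2*mu_c) of the noiseless estimate.

    The bound is only informative when the cluster coherence is below 1/2,
    so other values are rejected.
    """
    if not 0.0 <= mu_c < 0.5:
        raise ValueError("the bound requires 0 <= mu_c < 1/2")
    return 2.0 * delta / (1.0 - 2.0 * mu_c)

# Linear dependence on the clustered sparsity delta (mu_c fixed).
bound_small = noiseless_error_bound(0.01, 0.25)
bound_large = noiseless_error_bound(0.02, 0.25)

# Blow-up as the cluster coherence approaches 1/2 (delta fixed).
bound_near_half = noiseless_error_bound(0.01, 0.49)

print(bound_small, bound_large, bound_near_half)
```

Doubling $\delta$ doubles the bound, whereas letting $\mu_c$ approach $1/2$ makes the bound arbitrarily large, which is why the index set $\Lambda$ has to balance both quantities.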



2.3 Inpainting via Thresholding 

Another fundamental methodology from compressed sensing for sparse recovery is thresholding. The beauty of this approach lies in its simplicity and its associated fast algorithms. Typically, it is also possible to prove success of recovery in situations similar to those in which $\ell_1$ minimization succeeds.

Various thresholding strategies are available, such as iterative thresholding. It is thus surprising that the simplest imaginable strategy, which is to perform just one step of hard thresholding, already allows for error estimates as strong as those for $\ell_1$ minimization. We start by presenting this thresholding strategy. For technical reasons - note also that this is no restriction at all - we now assume that the Parseval frame $\Phi = \{\varphi_i\}_i$ consists of frame vectors with equal norm, i.e., $\|\varphi_i\| = c$ for all $i$.

One-Step-Thresholding

Parameters:

• Incomplete signal $\tilde{x} = P_K x^0$ (noiseless) or $\tilde{x} = P_K x^0 + n$ (with noise).

• Thresholding parameter $\beta$.

Algorithm:

1) Threshold Coefficients with Respect to Frame $\Phi$:

a) Compute $\langle \tilde{x}, \varphi_i \rangle$ for all $i$.

b) Apply threshold and set $T = \{i : |\langle \tilde{x}, \varphi_i \rangle| \ge \beta\}$.

2) Reconstruct Original Signal:

a) Compute $x^\star = \Phi 1_T \Phi^* \tilde{x}$.

Output:

• Significant thresholding coefficients: $T$.

• Approximation to $x^0$: $x^\star$.

Figure 6: One-Step-Thresholding Algorithm to reconstruct $x^0$ from noiseless $P_K x^0$.
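The algorithm above can be sketched in a finite-dimensional setting. This is only an illustration under stated assumptions: the frame built by `parseval_frame` (two random orthonormal bases, rescaled) is a hypothetical test frame and not the wavelet or shearlet system analyzed in this paper, and the function and variable names are ours.

```python
import numpy as np


def parseval_frame(n, seed=0):
    """Hypothetical test frame: two random orthonormal bases, rescaled.

    The returned n x 2n matrix has columns of equal norm c = 1/sqrt(2)
    and satisfies Phi @ Phi.T = I, i.e., it is a Parseval frame for R^n.
    """
    rng = np.random.default_rng(seed)
    u1, _ = np.linalg.qr(rng.standard_normal((n, n)))
    u2, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return np.hstack([u1, u2]) / np.sqrt(2.0)


def one_step_thresholding(x_obs, phi, beta):
    """One step of hard thresholding on the analysis coefficients."""
    coeffs = phi.T @ x_obs            # step 1a: <x_obs, phi_i> for all i
    keep = np.abs(coeffs) >= beta     # step 1b: T = {i : |coeff_i| >= beta}
    x_star = phi @ (coeffs * keep)    # step 2a: x* = Phi 1_T Phi^* x_obs
    return np.flatnonzero(keep), x_star


if __name__ == "__main__":
    phi = parseval_frame(32)
    x0 = 3.0 * phi[:, 5]                   # signal built from one frame vector
    missing = np.zeros(32, dtype=bool)
    missing[10:13] = True                  # indices of the missing samples
    x_obs = np.where(missing, 0.0, x0)     # P_K x0 (noiseless case)
    T, x_star = one_step_thresholding(x_obs, phi, beta=0.5)
    print(len(T), np.linalg.norm(x_star - x0))
```

With $\beta = 0$ every coefficient is kept and the Parseval property returns the incomplete signal unchanged, so any inpainting effect comes entirely from discarding the coefficients outside $T$.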

The following result provides us with an estimate for the $\ell_2$ error of the synthesized signal $x^\star$ computed via One-Step-Thresholding.

Proposition 2.9. Let $T$ and $x^\star$ be computed via the algorithm One-Step-Thresholding (Figure 6) for noiseless data, and for $\delta > 0$ assume that $x^0$ is $\delta$-relatively sparse in $\Phi$ with respect to $T$. Then

$$\|x^\star - x^0\|_2 \le c\,\bigl[\delta + \|1_T \Phi^* P_M x^0\|_1\bigr].$$

As before, Proposition 2.9 follows as a corollary to the case with noise:

Proposition 2.10. Let $T$ and $x^\star$ be computed via the algorithm One-Step-Thresholding for data with noise, and for $\delta > 0$ assume that $x^0$ is $\delta$-relatively sparse in $\Phi$ with respect to $T$. Also assume that the noise satisfies $\|\Phi^* n\|_1 \le \varepsilon$. Then

$$\|x^\star - x^0\|_2 \le c\,\bigl(\|1_T \Phi^* P_M x^0\|_1 + \delta + \varepsilon\bigr).$$

Proof. Invoking the decomposition of $\mathcal{H}$ and the fact that $\Phi$ is Parseval,

$$\|x^\star - x^0\|_2 = \|\Phi 1_T \Phi^*(P_K x^0 + n) - \Phi\Phi^* P_K x^0 - P_M x^0\|_2 = \|-\Phi 1_{T^c} \Phi^* P_K x^0 + \Phi 1_T \Phi^* n - P_M x^0\|_2.$$

Since

$$P_M x^0 = \Phi 1_T \Phi^* P_M x^0 + \Phi 1_{T^c} \Phi^* P_M x^0$$

and $P_K x^0 + P_M x^0 = x^0$, it follows that

$$\|x^\star - x^0\|_2 \le \|\Phi 1_{T^c} \Phi^* x^0\|_2 + \|\Phi 1_T \Phi^* P_M x^0\|_2 + \|\Phi 1_T \Phi^* n\|_2.$$

It follows from the equal-norm condition on the frame $\Phi$ that for any $\ell_1$ sequence $x$,

$$\|\Phi x\|_2 \le c\,\|x\|_1.$$

Applying the clustered sparsity of $x^0$ we obtain

$$\|x^\star - x^0\|_2 \le c\,\bigl(\|1_T \Phi^* P_M x^0\|_1 + \delta + \varepsilon\bigr),$$

which is what we intended to prove. □
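The estimate of Proposition 2.10 can be checked numerically in finite dimensions. The sketch below makes several assumptions that are not part of the paper: it uses a hypothetical equal-norm Parseval frame built from two random orthonormal bases, it takes $\delta = \|1_{T^c}\Phi^* x^0\|_1$ and $\varepsilon = \|\Phi^* n\|_1$ (the smallest admissible values), and all names are illustrative.

```python
import numpy as np


def equal_norm_parseval_frame(n, rng):
    """Hypothetical test frame: columns have norm 1/sqrt(2), Phi @ Phi.T = I."""
    u1, _ = np.linalg.qr(rng.standard_normal((n, n)))
    u2, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return np.hstack([u1, u2]) / np.sqrt(2.0)


def error_and_bound(x0, noise, missing, phi, beta):
    """Return (||x* - x0||_2, c * (masked term + delta + eps))."""
    c = np.linalg.norm(phi[:, 0])                  # common norm of the frame vectors
    p_m_x0 = np.where(missing, x0, 0.0)            # P_M x0 (the missing part)
    x_obs = (x0 - p_m_x0) + noise                  # P_K x0 + n
    coeffs = phi.T @ x_obs
    in_t = np.abs(coeffs) >= beta                  # thresholded index set T
    x_star = phi @ (coeffs * in_t)                 # Phi 1_T Phi^* x_obs
    delta = np.abs(phi.T @ x0)[~in_t].sum()        # ||1_{T^c} Phi^* x0||_1
    masked = np.abs(phi.T @ p_m_x0)[in_t].sum()    # ||1_T Phi^* P_M x0||_1
    eps = np.abs(phi.T @ noise).sum()              # ||Phi^* n||_1
    return np.linalg.norm(x_star - x0), c * (masked + delta + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi = equal_norm_parseval_frame(24, rng)
    x0 = rng.standard_normal(24)
    missing = np.zeros(24, dtype=bool)
    missing[8:11] = True
    noise = 0.01 * rng.standard_normal(24)
    print(error_and_bound(x0, noise, missing, phi, beta=0.3))
```

The bound is typically far from tight on random data; its value lies in making explicit which three quantities control the thresholding error.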

As before, let us also interpret this estimate. Now the situation is slightly different from the estimate for the $\ell_1$ approach. Again, the estimate depends linearly on the clustered sparsity and the noise. The difference now is the appearance of the term $\|1_T \Phi^* P_M x^0\|_1$ in the numerator instead of the cluster coherence in the denominator. Note, however, that

$$\|1_T \Phi^* P_M x^0\|_1 \le \|1_T \Phi^* P_M \Phi\|_{1\to 1}\,\|\Phi^* P_M x^0\|_1 \le \mu_c(T, P_M\Phi; \Phi)\,\|\Phi^* P_M x^0\|_1.$$

Thus both in the $\ell_1$ minimization case (Proposition 2.7) and in the thresholding case (Proposition 2.9) the bound on the error is lower when the cluster coherence is lower. Furthermore, $\|\Phi^* P_M x^0\|_1$ is a quantification of how much of the signal is missing, which clearly cannot be too high.
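The relation between the masked term and the cluster coherence can be illustrated numerically. The sketch assumes the definition $\mu_c(\Lambda, \Phi_1; \Phi_2) = \max_j \sum_{i \in \Lambda} |\langle \varphi_{1,i}, \varphi_{2,j} \rangle|$ from the abstract setting, treats $P_M$ as a coordinate projection on $\mathbb{R}^n$, and requires `phi` to be a Parseval frame stored column-wise; the function names are ours.

```python
import numpy as np


def cluster_coherence(index_set, phi1, phi2):
    """mu_c(index_set, phi1; phi2) = max_j sum_{i in index_set} |<phi1_i, phi2_j>|."""
    gram = np.abs(phi1[:, index_set].T @ phi2)   # rows: i in index_set, columns: j
    return gram.sum(axis=0).max()


def masked_term_and_bound(x0, missing, phi, index_set):
    """Return (||1_T Phi^* P_M x0||_1, mu_c(T, P_M Phi; Phi) * ||Phi^* P_M x0||_1)."""
    p_m_phi = np.where(missing[:, None], phi, 0.0)   # P_M applied to each frame vector
    p_m_x0 = np.where(missing, x0, 0.0)              # P_M x0
    analysis = phi.T @ p_m_x0                        # Phi^* P_M x0
    lhs = np.abs(analysis[index_set]).sum()
    rhs = cluster_coherence(index_set, p_m_phi, phi) * np.abs(analysis).sum()
    return lhs, rhs
```

Here `index_set` is an integer array of frame indices and `missing` a boolean array marking the masked coordinates. A smaller index set or a smaller mask lowers the coherence factor, which is the mechanism exploited in the later sections.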



3 Mathematical Model 

We next provide a specific mathematical model which is motivated by the fact that images are typically governed by edges, which can most prominently be seen in, for example, seismic imaging (Figure 7). Following this line of thought, our model is based on line singularities - which can, as explained later, be extended to curvilinear singularities - with missing data in the form of gaps or holes. In this section, such a model for the original image and the mask will be introduced. Since the analysis we derive later is based on the behavior in the Fourier domain, the Fourier content of the models is another focus.

3.1 Image Model 

Inspired by seismic data with missing traces, an example of which is found in Figure 7, we define our mathematical model. The data can be viewed as a collection of curvilinear singularities which are missing nearly vertical strips of information. We first simplify the model by considering linear singularities. As shearlets are directional systems, we then simplify the model so that the linear singularity is horizontal. The specific mathematical model that we shall analyze is as follows. Let $w : \mathbb{R} \to [0, 1]$ be a smooth function that is supported in $[-\rho, \rho]$, where we always assume that $\rho$ is sufficiently large, in particular, much larger than $h$ (a measure of the missing data which will be defined in the next subsection). For now, we consider as a prototype of a line singularity the weighted distribution $w\mathcal{L}$ acting on tempered distributions $\mathcal{S}'(\mathbb{R}^2)$ by

$$\langle w\mathcal{L}, f \rangle = \int_{-\rho}^{\rho} w(x_1) f(x_1, 0)\,dx_1.$$

Figure 7: Synthetic seismic data with randomly distributed missing traces - Hennenfent and Herrmann [HH06].

Notice that this distribution is supported on the segment

$$[-\rho, \rho] \times \{0\}$$

of the x-axis, hence can be employed as a model for a horizontal linear singularity. The weighting was chosen to ensure that we are dealing with an $L^2$-function after filtering. The Fourier transform of $w\mathcal{L}$ can be computed to be

$$\langle \widehat{w\mathcal{L}}, f \rangle = \langle w\mathcal{L}, \hat{f} \rangle = \int_{\mathbb{R}} \hat{w}(\xi_1) \int_{\mathbb{R}} f(\xi)\,d\xi_2\,d\xi_1.$$

Let now Fj be a filter corresponding to the frequency corona Cj at level j (see Equation Q). 
defined by its Fourier transform Fj, 

Fj = {W L (2- 2 ^) + W\2- 2 i- 1 i)) . 

To simplify the proofs for wavelets, we also define 

L&{h,v,d} 

so that $F_j = F^w_{2j} + F^w_{2j+1}$. We use two bands for the wavelets so that the wavelet and shearlet systems will be compared on the same frequency corona. This makes sense as the base ($j = 1$) dilation for the 2D wavelets has determinant 4, while the base dilation for the shearlets has determinant 8. We consider the filtered version of $w\mathcal{L}$, which we denote by $w\mathcal{L}_j$, i.e.,

$$w\mathcal{L}_j = w\mathcal{L} * F_j = \int_{\mathbb{R}^2} w\mathcal{L}(\cdot - t) F_j(t)\,dt. \qquad (7)$$

The next result provides us with an estimate of the norm of $w\mathcal{L}_j$.

Lemma 3.1. For some $c > 0$,

$$\|w\mathcal{L}_j\|_2 \ge c\,2^j, \quad j \to \infty.$$

Proof. We have 

/ \ 1/2 

\\wCjh > [ H£i)| 2 <i6 / d& 1 «c3?. 



□

3.2 Masks



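Inspired by the missing sensor scenario in seismic data, we define the mask of a missing piece of the image as follows. The mask $\mathcal{M}_h$ is a vertical strip of diameter $2h$ and is given by

$$\mathcal{M}_h = 1_{\{|x_1| \le h\}}.$$

For an illustration, we refer to Figure 8.

Figure 8: Mask $\mathcal{M}_h$ (gray shaded region), together with the linear singularity $w\mathcal{L}$ (horizontal line with dashed center indicating part masked out).

For the convenience of the reader, we compute the associated Fourier transforms, where as usual we set $\operatorname{sinc}(y) = \sin(\pi y)/(\pi y)$ for $y \in \mathbb{R}$.

Lemma 3.2. We have

$$\widehat{\mathcal{M}}_h = 2h \operatorname{sinc}(2h\xi_1)\,\widehat{\mathcal{L}}_y,$$

where $\mathcal{L}_y$ is the distribution acting as

$$\langle \mathcal{L}_y, f \rangle = \int f(0, y)\,dy$$

and $\langle \widehat{\mathcal{L}}_y, f \rangle = \int f(x, 0)\,dx$.

Proof. Define the planar Heaviside by $H(x) = 1_{\{x_1 > 0\}}$. Since $\mathcal{L}_y = \frac{\partial}{\partial x_1} H$, we have $\hat{H}(\xi) = (2\pi i \xi_1)^{-1}\,\widehat{\mathcal{L}}_y$. We now express $\mathcal{M}_h$ in terms of $H$ by

$$\mathcal{M}_h = H(x + (h, 0)) - H(x - (h, 0)).$$

This leads to

$$\widehat{\mathcal{M}}_h = \bigl(e^{2\pi i h \xi_1} - e^{-2\pi i h \xi_1}\bigr)(2\pi i \xi_1)^{-1}\,\widehat{\mathcal{L}}_y = \frac{2\sin(2\pi h \xi_1)}{2\pi \xi_1}\,\widehat{\mathcal{L}}_y = 2h \operatorname{sinc}(2h\xi_1)\,\widehat{\mathcal{L}}_y.$$

The proof is finished. □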
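The scalar factor in Lemma 3.2 is the one-dimensional Fourier transform of the indicator of $[-h, h]$, and it can be checked numerically. The following sketch assumes the convention $\hat{f}(\xi) = \int f(x) e^{-2\pi i x \xi}\,dx$ and uses a plain trapezoidal rule; `numpy.sinc` is the normalized sinc $\sin(\pi y)/(\pi y)$, matching the convention above. The function names are ours.

```python
import numpy as np


def strip_ft_numeric(h, xi, samples=20001):
    """Trapezoidal approximation of the integral of exp(-2 pi i x xi) over [-h, h]."""
    x = np.linspace(-h, h, samples)
    values = np.exp(-2j * np.pi * x * xi)
    dx = x[1] - x[0]
    return dx * (values.sum() - 0.5 * (values[0] + values[-1]))


def strip_ft_closed_form(h, xi):
    """Closed form 2h sinc(2h xi) with sinc(y) = sin(pi y) / (pi y)."""
    return 2.0 * h * np.sinc(2.0 * h * xi)


if __name__ == "__main__":
    for xi in (0.0, 0.7, 2.5):
        print(xi, strip_ft_numeric(0.3, xi), strip_ft_closed_form(0.3, xi))
```

The imaginary part of the numerical value vanishes up to rounding because the indicator is an even function.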
3.3 Transfer of Abstract Setting

All of the main proofs in Sections 4 and 5 will follow a particular pattern. Either Proposition 2.7 (in the case of $\ell_1$ minimization) or Proposition 2.9 (in the case of thresholding) is applied to the situation in which $x^0$ is chosen to be the filtered linear singularity $w\mathcal{L}_j$, the Hilbert space $\mathcal{H}_M$ is defined by $\{f\,\mathcal{M}_h : f \in L^2(\mathbb{R}^2)\}$, and $\Phi$ is either the Parseval system of Meyer wavelets or of shearlets at scale $j$.

In the analysis that follows, $\delta_j$ will denote the optimal $\delta$-clustered sparsity for filtered coefficients. That is, for $\ell_1$ minimization with a fixed filter level $j$, we will fix a set $\Lambda_j$ of significant coefficients of $\Phi = \{\psi_\lambda\}_\lambda$ and set

$$\delta_j = \sum_{\lambda \in \Lambda_j^c} |\langle w\mathcal{L}_j, \psi_\lambda \rangle|.$$

Similarly, we will analyze thresholding schemes by setting

$$\delta_j = \sum_{\lambda \in T_j^c} |\langle w\mathcal{L}_j, \psi_\lambda \rangle|,$$

where the $T_j$ are the significant coefficients in the One-Step-Thresholding Algorithm. The inpainting accomplished (i.e., the solution in Proposition 2.7 or Proposition 2.9) on the filtered levels $j$ will be denoted by $L_j$. $w\mathcal{L}_j$ will denote the filtered real image; that is, $w\mathcal{L} * F_j$, where $w\mathcal{L}$ is the original, complete image. Thus, the main theorems in Sections 4 and 5 will show that

$$\frac{\|L_j - w\mathcal{L}_j\|_2}{\|w\mathcal{L}_j\|_2} \to 0, \quad j \to \infty.$$

The results will specifically depend on the asymptotic behavior of the gap $h_j$. For the proofs involving the Meyer system, the following notation will also be useful:

$$w\mathcal{L}^w_j = w\mathcal{L} * F^w_j.$$



4 Positive Results for Wavelet Inpainting 

We begin by proving theoretically for the first time what has been known heuristically; namely, that wavelets can successfully inpaint an edge as long as not too much is missing. In Subsection 4.1, we investigate the inpainting results of $\ell_1$ minimization by estimating the $\delta$-clustered sparsity $\delta_j$ and cluster coherence $\mu_c$ with respect to $\Phi = \{\psi_\lambda : \lambda = (\iota, j, k),\ \iota = h, v, d;\ k \in \mathbb{Z}^2\}$ and a properly chosen index set $\Lambda_j$. In Subsection 4.2, we similarly give the estimation of $\delta_j$ and $\mu_c$ for inpainting using thresholding.

4.1 $\ell_1$ Minimization

In what follows, we use the compact notation $\langle a \rangle := (1 + |a|^2)^{1/2}$. We first need to choose the set of significant coefficients appropriately. We do this by setting

$$\tilde{\Lambda}_j = \{(\iota; j, k) : |k_1| \le \rho n_j 2^j,\ |k_2| \le n_j,\ \iota = h, v, d\},$$

where $n_j = 2^{\varepsilon j}$. This choice of $\Lambda_j = \tilde{\Lambda}_{2j} \cup \tilde{\Lambda}_{2j+1}$ forces the clustered sparsity to grow slower than the growth rate of $\|w\mathcal{L}_j\|_2$:
Lemma 4.1. $\delta_j = o(1) = o(\|w\mathcal{L}_j\|_2)$, $j \to \infty$.

Proof. By definition, we have

S j = E \( w£ iM\ ^ ^2(\( w£ 2jM\ + I (w£2j+lM 



AeA 5 AeA ? 



< Y {\{wC 2 j^\)\+ E \i wC 2j+i,^> 
AeA=. AGA5 

=: S 2 j+S 2 j+i. 



l 2j + l 



We now compute 



that is, 



h = E \( wC iM\ = E \( wC ii$\) 
AeA? AeA'r 



h = Y 



AeA!; 



R 2 



2-^(a)^(0^(C/2 J )e 



d{ 



AeA^ 

where Gj(£,) = 2~ 3 w(^i)Fj(^ t )W''(^/2 J ) is a smooth and compactly supported function that is 
essentially supported on 

[-i/pMp] x [-2^0,2^0]. 

Applying the change of variable (£1, £2) ^ {p~ l ii, 2 3 ^ 2 ) ensures that Gj(p~ 1 ^\, 2 J £ 2 ) is smooth and 
compactly supported independent of j. Then 



-2m{k,£,/2i) 



R2 

Consequently, Sj/cN is bounded above by 

P - i ^j2{\(p~ 1 ki/2 j ,k 2 )\)- N 



< CN\\G J \\ OQ {P~ 1 ^){\{P~ 1 ki/2^k 2 )\)~ N < CNip-^Mp- 1 ^, k 2 )\)- N . 



AeA? 



< p-^i E <l(^r>*2)ir"+ E <I(^W 



■N 



\k 1 \>pn j 2i ,k 2 



\ Jpn j2 i JR ZJ 



\k 1 \<pn j 2i ,|fe 2 |>Tij 

x 2 )\r N dx 2 d X1 + I I (\{ p — 1. 



Jn 



x 2 )|) N dx 2 dx\ 



< 

I rij Jr. 



(|(xi,X2)|) _Ar dx2(ixi + / / (|(xi,x 2 )|) -jv da;2da;i 



Thus, 



and for $N$ large enough, $\delta_j \to 0$ as $j \to \infty$.



□ 
On the other hand, the choice of $\Lambda_j$ offers low cluster coherence as well:

Lemma 4.2. For $h_j = o(2^{-2j})$ as $j \to \infty$, we have

$$\mu_c(\Lambda_j, \{\mathcal{M}_{h_j}\psi_\lambda\}; \{\psi_\lambda\}) \to 0, \quad j \to \infty.$$

Proof. We again first consider Aj. By definition, we have 

fi c (A j ,{M hj ipx};{ipx}) = max ^ | (A^^.^a, V'A') | = max ^ (A^. *i>\A> 



AeA, 



AeA, 



Note that for A = (i, j, k), we can choose A' = 0). 



R 2 JR 2 



[ [ 2h jS mc(2h^i)Mr-^i,0))d^y(T)di 
Jr 2 Jr. 



r 



2 J 2hj 



2hj sinc(2hjS,i) 



R2 2-? 23 



dtx 



R 2 



sinc(2 J 2/i j ei)^ t ((r - 0)))e 2 ^ fcl6 ^i^ t '(r) 



R 



iHhj / 5 j (r)e- 27ri < fc ' T ^r, 

JR2 



where 



^.( T ) := W * ( T ) / sinc^/i^W^T - (&,0)))e 



27ri(fc 1 ,?i> 



(8) 



R 



is a smooth function supported on a box independent of j. Hence, \f gj(r)e 2mkr dT\ < Cjv||<7j||oo(|&|) ) 
and 

HdiHoo < csup J I sinc(2 J '2^a)||W(r - (£i,0))|d£i < c|| sinc(2 Jr ' 2^ - ) || 2 < c(2 J '/i)~ 1/2 . 
Consequently, we have 

mA^a^aM^a}) < CNVhjiVhj)- 1 ' 2 (\k\y N < CNi^h,) 1 / 2 , 

fcez 2 

where 

Hc(Aj, {M hj ip x }; {V'a}) = Mc(A 2 j, {.M^a}; {V>a}) + Mc(A 2j+ i, {-Mfc^A}; {V'a})- 
which goes to $0$ as $j \to \infty$ by assumption. □

We would like to remark at this point that we do not need the strong condition that $h_j = o(2^{-2j})$ as $j \to \infty$. In fact, carefully handling the constants in the proof of Lemma 4.2 will lead us to the condition

$$\mu_c(\Lambda_j, \{\mathcal{M}_{h_j}\psi_\lambda\}; \{\psi_\lambda\}) \le c_N\,(2^{2j} h_j)$$

with precise knowledge of the value of $c_N$. Since ultimately we "only" need the cluster coherence to stay bounded away from $1/2$, we only require the weaker condition

$$2^{2j} h_j \le \frac{1}{2 c_N} - \varepsilon \quad \text{for some } \varepsilon > 0 \text{ and for all } j \ge j_0.$$

This condition would then also be sufficient for deriving the following theorem.



We now apply Proposition 2.7 to Lemmata 3.1, 4.1, and 4.2 to obtain the desired convergence for the normalized $\ell_2$ error of the reconstruction $L_j$ derived from (3), where in this case $x^0 = w\mathcal{L}_j$ and $\Phi$ are wavelets $\psi_\lambda$ at scale $j$.

Theorem 4.3. For $h_j = o(2^{-2j})$ and $L_j$ the solution to (Inp) with $\Phi$ the 2D Meyer Parseval system,

$$\frac{\|L_j - w\mathcal{L}_j\|_2}{\|w\mathcal{L}_j\|_2} \to 0, \quad j \to \infty.$$

This result shows that if the size of the gap shrinks faster than $2^{-2j}$ - i.e., the size of the gap is asymptotically smaller than $2^{-2j}$ - or if the gap shrinks at the same rate as $2^{-2j}$ with an exactly prescribed factor, we have asymptotically perfect inpainting.

4.2 Thresholding 

We will now study thresholding as an inpainting method, which is from a computational point of view much easier to apply than $\ell_1$ minimization. Our analysis will show that we can derive the same asymptotic performance as for $\ell_1$ minimization.

Our first claim concerns the set of the thresholding coefficients $T_j$.

Lemma 4.4. For $h_j = o(2^{-2j})$ as $j \to \infty$, there exist thresholds $\{\beta_j\}_j$ such that, for all $j \ge j_0$,

$$\{k : |k_1| \le \rho\,2^{2j(1+\nu_1)},\ |k_2| \le 2^{2j\nu_1}\} \subseteq T_j$$

for positive $j_0$ and $\nu_1$.

Proof. We again first analyze wCj. By Plancherel, we can rewrite the coefficients which we have 
to threshold as follows: 

|((1 - M hj )wCj,ip\)\ = {(So-kwCj^x) - {M hj -kwCj^ x )\- 
Choose a function F such that F(-/2 J ) = Fj. Then, 

wCfc) = wC{Om) = w2(0F(U2'). 
As we are analyzing a horizontal line singularity, we only need to consider 

tJj x = 2- j W v (^/2 j )e- 2wi{k ^ /23) 
for large wavelet coefficients. Then, the first term equals 



ti(ti)(FW v )((Z 1 /2>,S 2 ))e- 2 * i < kl / 2i Md£ i 



-27ri(fc2,C2> 



d& 
By using Lemma 3.2 we derive for the second term:



= 2hj2-i [ smc{2hjTi) [ w^Ffa/V , ^/2 3 )W v ((t 1 + 6/2 J >- 2 ™ (fc/2 '' T1+ ^ 2) d£ciTi 
= 2h 3 

Let G now be the function 



u>(£l) / sinc((/ lj /7r)r 1 )F(6/2 J ,6)W(((n +ei)/2 J ,e 2 ))e- 2m<fcl ' (ri+?l)/2J) ( ir 1 dei 



G(6 



(W)((6/2^ 2 )) + 



-2/y J sinc((/ li /^)r 1 )F(6/2 J ,6)W((n + £i)/2*, 6)e- 2 ^ (fel/2J>1 dr 1 



-2iri(fa/2 j )( 



u;(6)%(^i)^ 27rj(fcl/2J)?1 ^i 



with 



= (FTn((£i/2 J ,6)l)-2^ / sinc((/ lj /7r)r 1 )F(6/2^e2)^((ri+6)/2 J ,6)e" 2 ^ <(/ci/2J) ' Tl> dn. 



The function G is supported on the set [1/2, 2], which is independent of j. By standard arguments, 
we can deduce that 

Kil-Mh^wLj^x)] ^cjvJGIUM)-^. (9) 
Let us now investigate the term ||G||oo further. Using Plancherel and the support properties of w, 

r . i—fa/tf+p 

/ w(— k\/2 3 — x)H^ 2 {x)dx ps c / H^ 2 (x)dx 

J " J— fa/23— p 

For the analysis of the function -ff^, we use well-known properties of the Fourier transform to 
derive 

%(x) = ((JW*)(./2*,6»)) V (x) ~{2hj sinc(2/ l ,-)e- 2 -^/ 2J >) * ((fT)^, &)f (-^ 

= ((Fr)(./2^ 2 )) V (x) - (2/1,- sinc((2/ lj -))e- 2 -( fcl / 2J )-) V (-x) ((Fr)(./2^ 2 )) V (-*) 

= ((F^)(-/2^,6)) V (*) - - fci/a') W-M6)) v (-*)• 

Hence, since hj < p, 



-fa/2i+p 
fa/2i-p 



H^ 2 (x)dx 



fa/23+p 



ki/2i+hj 



{{FW v ){-/2^i 2 )Y(x)- / ((FW v )(-/y,&) v (x)dx 

fa/23 -p Jfa/23-hj 
ki-Z>hj rfa+23p 



+ / ((FW v )(\(-,&)\)) V (x)dx 

fa-23p Jfa+2ihj 



(10) 
Notice that the bounds of integration indeed make sense, since the values of $k_1$ which lie "in between $h_j$ and $\rho$" should play an essential role. Due to the regularity of $W$, there exist some $N_2$ and $c$ (possibly differing from the one before, but we do not need to distinguish constants here) such that



and hence by (10) 



\((FW V M;t 2 )\nx)\<c(\ X \)- N >, 



|G||oo < c(min{|A:i - 2*p|, \h + V p\}) 



-N 2 



(11) 



Finally, we have to study how the function H relates to h, which will show the behavior of the 
coefficients as they approach the center of the mask. For this, setting 



we obtain 



4(n) = (FVT)((n + £i)/2^6)e- 27r ' (fel/2 V 1> , 

\{FW v ){Z 1 /2>,Z 2 )-2h j J sinc((/ lj /vr)r 1 )(F^)((r 1 + 6)/2 J ,6)e- 2m<fcl/23 ' Tl> dr 1 | 
= I J| 2 (0) — 2/ij / sinc((/ij/7r)ri) J^ 2 (ri)dri| 



-hj 

Hence another way to estimate UGH 00 is by 
I \G\ 00 



I J 6(°) - / M-hj^iT^Jbin^nl 

1^6(0) — / J&( x )dx\ = \ / J£ 2 (x)dx\. 



\x\>hj 



< H, 



2 II CO 



< max 

6,6 



< max 

6 



((FW V )((- + ei)/2 J ,6)) V (^ - h/V)dx 



\x\>hj 



\x\>2ihj 



((FW v )((-^2)nx-h)dx 



Certainly, the minimum is attained in the center of the mask, i.e., with k = 0. So combining this 
with ^ and ( fTTj ), 



\((1-M hi )w£j,ip\)\ < cmax 



\x\>23hj 



((F^)((-,6))) V (x)dx 



(|fc 2 |)- Ari (min{|fe 1 -2^|,|fc 1 +2V|}) 



-JV2 



which is what we intend to use as a "model." Observe that this indeed is also intuitively the right estimate, since the $k_2$ component has to decay rapidly away from zero, thereby sensing the singularity in zero in this direction. In contrast, the $k_1$ component stays greater than or equal to $(2^j \rho)^{-N_2}$ up to the point $2\rho 2^j$ and then decays rapidly, in accordance with the fact that up to the point $k_1 = \rho 2^j$ we are "on" the line singularity, which decays smoothly with $w$. Moreover, the first term models the behavior in the mask, which is also nicely supported by the fact that the crucial product $2^{2j} h_j$ appears therein.
We now apply the triangle inequality

\{(l-M hj )wCj,^ x )\ < \{{\-M hj )wL 2j ^ x )\ + \{{l-M hj )wL 2j+1 ,i> x )\. 
Since $2^{2j} h_j \to 0$ and $2^{2j+1} h_j \to 0$ as $j \to \infty$, we have as $j \to \infty$



max 

6 



>\x\>2^hj 

We now set the thresholds (3j to be 



((FWn((;^))r(x)dx 



a 



c(C-e) 



(|2 2 ^|)^ (min{|(2 2 ^ - l)2 2 ip|, \(2 2 ^ + l)2 2 ip|})^ " 

This choice immediately proves the claim of the lemma. □ 

Note that $\Lambda_j \subseteq \{k : |k_1| \le \rho\,2^{2j(1+\nu_1)},\ |k_2| \le 2^{2j\nu_1}\} \subseteq T_j$ for some $\nu_1 > 0$. For such a choice of $T_j$, we have the following lemma.

Lemma 4.5.

$$\delta_j = \sum_{\lambda \in T_j^c} |\langle w\mathcal{L}_j, \psi_\lambda \rangle| = o(\|w\mathcal{L}_j\|_2), \quad j \to \infty.$$

Proof. We observe from the proof of Lemma 4.1 that the desired property is automatically satisfied provided that, for all $j \ge j_0$, the set $T_j$ satisfies

$$T_j \supseteq \{k : |k_1| \le \rho\,2^{2j(1+\nu_1)},\ |k_2| \le 2^{2j\nu_1}\} \supseteq \Lambda_j$$

for some $\nu_1 > 0$, which is implied by Lemma 4.4. □


We next analyze the second term in the estimate from Proposition 2.9.

Lemma 4.6. For $h_j = o(2^{-2j})$ as $j \to \infty$,

$$\sum_{\lambda \in T_j} |\langle \mathcal{M}_{h_j} w\mathcal{L}_j, \psi_\lambda \rangle| = o(2^j), \quad j \to \infty.$$



Proof. We first need to derive some estimates dependent on k for the term ^JVl^wCj^x^. By 
using the definitions of M-u an d wCj and a change of variables, we first obtain (M.^ w£,j, if)\) = 



2hj 



w(^) / sinc((2/ lj )r 1 )F(ei/2^6)^(((r 1 +ei)/2^e 2 ))e- 2m<fcl ' (Tl+ « l)/2: ' ) dr 1 dei 



21 d& 



Here F{-/2 3 ) = F. Let G now be the function 
G(6) = J w(ii)2hj J sinc((/ lj V^)r 1 )F(ei/2^e2)^((r 1 +6)/2 J ,6)e" 27ri< ' £l/2J ' Tl+fl> dr 1 dei 



-27ri(fcl/2^,ei> 
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= 2/»i / sinc((^/7r)ri)F(Ci/2^6)^((ri + 6)/2 i ,6)e~ 27ri<fel/2i,n> ^i. 



with 



The function G is supported on the set [—1/4,-1/16] U [1/16,1/4], which is independent of j. 
Hence, we have 

KMhjwCjMl < c^HGIIoodfcal)-^ 1 . (12) 
By Plancherel's theorem and the support properties of w, 



\(wH^(-k 1 /V)\ = \(w*Hz u )(-k 1 /y)\ 



w(—k\j2 3 — x)H^ 2 (x)dx 



l-ki/23-p 

Next, using well-known properties of the Fourier transform, we can manipulate H^ 2 : 
H i2 {x) = 



H^ 2 (x)dx 



(2hj sinc(2/ lr )e- 27rifcl/23 XF^(V2^,e 2 ))) (-x) 

2hj sinc(2%-)e" 27rifel/2 "'') V (-x) ((i J W w )(-/2*,6)) V (-x) 
l[-fc, lhj] (-x - fci/2*) ((iW)(-/2U 2 )l)) V (-*)• 



Hence, since /ij < p, 



-ki/23+p 



H^ 2 (x)dx 



k 1 /2i+h 3 



{{FW v )(-/2^i 2 )Y(x)dx 



pKi+z j rij 

/ ((fr)((,6))) v (^ 

Jki-2ih,4 



Notice that this indeed makes sense, since due to the masking the length of the line singularity 
isn't allowed to play a role here. Due to the regularity of W, there exists some constants N 2 and c 

|((FW)(|(-,-)|) V (x)|<c(|x|}-^. 

Hence, 

||G||oo < c(min{|A:i - 2 j hj\, \h + 2 j hj\})- N2 . 
Combining this estimate with (12), we obtain 

\(M hj w£j,ipx)\ < c(\k2\)- Nl {wm{\ki - 2?hj\, \h + tfhjl})-**, 

which is what we intend to use. 
Finally, 

^ \(M hj wCjM\ < c( ^(|A; 2 |)- JVl (mm{|fc 1 -2 2 ^|,|A ;i + 2 2 ^|})-^ + 



+ ^(\k 2 \)- Nl (mm{\k 1 -2 2 3 +1 h j \,\k 1 + 2 2 i +1 h j \}y 



-A ? 2 



< C. 



□ 






Notice that this result holds for any $T_j$, which again is intuitively clear: if it holds for the claimed one, then extending the set $T_j$ does not change the estimate, due to the fact that $\mathcal{M}_{h_j} w\mathcal{L}_j$ is zero "outside."



We now apply Proposition 2.9 to Lemmata 3.1, 4.5, and 4.6 to obtain the desired convergence for the normalized $\ell_2$ error of the reconstruction $L_j$ from One-Step-Thresholding in Figure 6. Again, in this case $x^0 = w\mathcal{L}_j$ and $\Phi$ are wavelets $\psi_\lambda$ at scale $j$.

Theorem 4.7. For $h_j = o(2^{-2j})$ and $L_j$ the solution to One-Step-Thresholding (Figure 6) with $\Phi$ the 2D Meyer Parseval system,

$$\frac{\|L_j - w\mathcal{L}_j\|_2}{\|w\mathcal{L}_j\|_2} \to 0, \quad j \to \infty.$$

This result shows that One-Step-Thresholding fills in gaps of the same size as $\ell_1$ minimization (Inp) in an asymptotic sense when considering the $\ell_2$ error.



5 Shearlet Inpainting Positive Results 



In this section, $\Phi$ is the shearlet frame as in (1) in Subsection 1.2.2. The general approach in this section is the same as in the preceding section. We show that use of the analysis coefficients of the shearlet system through either $\ell_1$ minimization or thresholding will successfully inpaint a line across a missing strip. Namely, in Subsection 5.1, we investigate the inpainting results of $\ell_1$ minimization by estimating the $\delta$-clustered sparsity $\delta_j$ and cluster coherence $\mu_c$ with respect to $\{\sigma_\eta : \eta = (\iota, j, \ell, k),\ \iota \in \{h, v, \emptyset\};\ |\ell| \le 2^j;\ k \in \mathbb{Z}^2\}$ and a properly chosen index set $\Lambda_j$. In Subsection 5.2, we similarly give the estimation of $\delta_j$ and $\mu_c$ for inpainting using thresholding.

Some of the proofs in this section are very similar in spirit to the corresponding ones in Section 4 but decidedly more technical due to the structural difference between wavelets and shearlets. The auxiliary functions (8) and (13) in the proofs of Lemma 4.2 and Theorem 5.3 demonstrate this relationship quite well.



5.1 $\ell_1$ Minimization

For our analysis we choose the set of significant shearlet coefficients to be

$$\Lambda_j = \{(\iota; j, k, \ell) : |k_1| \le \rho n_j 2^j,\ |k_2| \le n_j,\ \ell = 0;\ \iota = v\},$$

where we revive the notation $n_j = 2^{2\varepsilon j}$ from the previous subsection.

Now we can show the clustered sparsity of the shearlet coefficients with the choice of $\Lambda_j$.

Lemma 5.1. For $\varepsilon < 1/4$,

$$\delta_j = o(2^j), \quad j \to \infty.$$

Proof. By the definition, we have 

83 = \{ w £ji< J j,e,k)\+ Yl \( wC j' a jAk)\+ Y \( wC ^ a h,k)\ + 

|fc 1 |>pnj2J',|fc 2 |<n J /=0 \ka\>nj,l=0 fceZ 2 ,^0 

fcez 2 ,£ kez 2 
= : T 1 +T 2 +T 3 + T 4 + T 5 . 
To estimate $T_1$, we first estimate $\langle w\mathcal{L}_j, \sigma_\eta \rangle$ for the case $\ell = 0$ and $\iota = v$. By Lemma 8.3 in Section 8,

(wC,al kfi ) < c N aJ 1/2 (\k 2 \)~\[k 2 + a^min(a 3 k 1 ±p) 2 ] 1 / 2 ) 2 - N 



< c N a- 1/2 {\k 2 \)-\[kl + min(a- 1 fc 1 ± aj 2 p) 2 ] 1 / 2 ) 2 '" 



. fA(l .-V2 / , 7 .. h -J/..-2„. i .,,., ,.. , ^2-.Y 

Therefore, we have 



< CN<ij (\k 2 \) {a- m_in|ajA;i ±p|)" 



Ti < cjsrdj 1 ' 2 a j e ^ (a^ 2 mm |ajfci ± p)\) 2 A 



< Cjv aJ 1/2 a7* V (mmlaT^iiaT^)!) 2 -^ 
Note that $a_j^{-2\varepsilon} = n_j = 2^{2j\varepsilon}$. Since



/ {\afx -a- 2 p)\) 2 - N dx = aj [ (\y - aj 2 p\) 2 ~ N dy 

J\x\> P a-^ 3 3 J\y\> P af-^ J 

< a/f J\y\) 2 - N dy<c N a] +2 ^\ 

we obtain 



\y\>p*j 2 



rp . l/2-2e+2(7V-3) 

For T 2 , we have 

72 < Y, (^ + min(aT^ 1 ±aTV) 2 ] 1 / 2 ) 2 -^ 



-1/2 - 1 ± 

CjV °J fc ie Z,|fc 2 |>a- 2e 



< £ ([fcl + mmfaT^xia-^) 2 ] 1 / 2 ) 2 - 

|fci|<pa7 1_2 Mfc 2 |>a- 2e 

+ Yl ([k 2 + mh,(aj 1 k 1 ±aj 2 p) 2 } 1 / 2 ) 

=: T 2 ,i + T29- 



|fci|>paT 1 - 2 Mfe 2 |>aT 2£ 



For T2 i , we have 



r 2 ,i < c / / (H) 

JlxiKpa" 1 -^ J|x 2 |>a- 2f 



~ N dx 2 dx\ 



, -l+2(iV-4)e 



For T2 5 2, we have 

T 2 ,2 < caj 



/ / (|(x 1 ,X 2 )|) 2 - Ar dX2^ 1 

Jxx>pa - 2 2e Jx 2 >a, 2e 



< caf N - 3)(1+2e \ 
Therefore,



rp ^ -3/2+2(AT-l)e 
T 2 < c N a j 



For $T_3$, we convert the result in Lemma 8.4 in Section 8 to the discrete case.
Lemma 5.2. Let t\ = a 2 -{k\ — £k 2 ) and t 2 = a,jk 2 with a,j = 2~K 
(i) For t\ 7^ and t 2 7^ 0, we have 

\(wCj, o-j^ jk )\ < CNe~ ca i a~ 1//2 |a^(A:i - £k 2 )\ N \a,jk 2 \~ N af , 



and 



-12 



\{wCj,(jj^ k }\ < c^e ca 3 a- l l 2 \aj(k\ — £k 2 )\ N \a-jk 2 \ N Oj N 
(ii) When exactly one of t\ or t 2 is and 1 £ {h, v}, we have 

\(w£, cr^ >fc )| < cl [max{a||fei - ^fe 2 |, aj|/c 2 |}] L a~ X ^ 2 e~ ca i> 

(hi) For ti = t 2 = and i G {/i, f }, we have 

For ti := a^(A;i — tk 2 ) 7^ and t 2 := a>jk 2 7^ 0, we have 

3 



|aj(^i— ^2)1 l%'^2| N < — £x 2 )\ \djX 2 \ N dx\dx 2 



< c- I \x\\ N \x 2 \ N dx\dx 2 

|xi|>l,|a!2|>l 



< OO. 

Hence 

|a 2 (/ci — £k 2 )\ \ajk 2 \~ < caj 3 . 

fcez 2 ,ti^o,t 2 ^o 

Similarly, for t\ = or t 2 = 0, we have 

[max{a 2 |£;i — £k 2 \, aj\k 2 \}] N <ca~ 3 . 

feGZ 2 ,ti=o or t 2 =o 

The estimate for (iii) follows by direct computation. Therefore, by the above estimates (i), (ii) 
and (iii), and that 

-J 1 

T 3 = E E \( w ^iAk)\ + Ei(^i>^,o)i, 

e=i fcez 2 ,(ti,t 2 )^o «=i 



we obtain 

a- 

-1/2 -ca, 1 / -3 , i\ / „ ./V 



T 3 < ^ C7 va~ 1/2 e- c ^ (aj 3 + 1) < CAra f ViV > 0. 



=1 
Similarly, for $T_4$,



J 

T 4 <Y, CNaJ^e-™' 1 {af + 1) < c N af ViV > 0. 



Finally, since the "seam" elements $\sigma_{j,\ell,k}$ are only slight modifications of the $\sigma^{\iota}_{j,\ell,k}$, $T_5 \le c_N a_j^N$ for all $N > 0$.

Combining the estimates for $T_1, \ldots, T_5$, we are done. □

Next we estimate the cluster coherence $\mu_c(\Lambda_j, \{\mathcal{M}_{h_j}\sigma_\eta\}; \{\sigma_\eta\})$ and show that it converges to zero as $j \to \infty$ when $h_j$ is related to $j$ by $h_j = o(2^{-j})$ as $j \to \infty$. We wish to remark that the size of the gaps which can be filled with asymptotically high precision is dramatically larger than the corresponding size for wavelet inpainting.

Theorem 5.3. For $h_j = o(2^{-j})$,

$$\mu_c(\Lambda_j, \{\mathcal{M}_{h_j}\sigma_\eta\}; \{\sigma_\eta\}) \to 0, \quad j \to \infty,$$

with $\eta = (\iota, j, \ell, k)$ and $\iota \in \{h, v, \emptyset\}$.
Proof. We have 



/./, 



:(Aj,{A^, ov,};^??}) = max V |(A^ ft ,cr T;i , cr^ 2 )| 

f?2 ^ — ' 



< max V" \{M hi a Vl ,a V2 }\ + max V" |(A^ h .cJ m , cr % )| + max V] K-M/,.^, cr. 

=: Ti+T 2 +T 3 . 
We bound Ti using simple substitutions: 



1)2/ 



Tl < Y, \(M hj o- v jAk ,*lo, )\ 
{i;j,e,k)eAj 



< 



(i;j,£,k)£Aj 



R 



2hj sine (2/i j^i) 



R 2 



2- 3j W(^)w( (Tl 22 ^' T2) ) l/(f + 2^)7(2^)x 



xe 



-2iri<t,(T-Ki,0))A«_ j S?> dr 



< 2(2 j hj) V / sinc(2- ? 2/i i ^i) / 



xe 2« itl 2it le -lMt,Al /a .T) dT 



/R2 2 j V ^ J r 2 r 2 



< 2(2%) Y [ 9 3 (r)e- 2m{t ' A ^ } dr 



(t,;j,e,k)eAj 
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where t = A^ a S^k with aj = 2 and 



&(r) := / r sinc(2^a)^ (/+ ^) e W ^iW g,r 2 ) W (^^2) V (^) . (13) 

Note that the support of W(ti/2-? , •) and of W ^ Tl 2 ^ 1 ") of variable r 2 is independent of j and the 

support of V{-/t2) of variable n is depending only on r 2 . Hence, gj{r) is smooth and compactly 
supported on a box E of volume independent of j, 

\J gAr)^dT\<c N \\g j \\ 00 (\t\)- N . 

Note that 

||^||oo<c(2^)-V2 ; 

therefore, 

T 1 <c(2>h j ) 1 / 2 Y,(\k\)~ N ^0,3^<x>- 
kez 2 

We now bound T 2 : 

(i;j,e,k)eAj 



< 



(t;i/,fe)eA. 

' 2/ij- sinc(2^ei)< jiS , (r - (6, 0))^. iS , )0 (r)d6 
R, 



(i;j,e,k)£A 



R 2 



-27ri(t,r-(a,0)> d7 



777,- a JR 2 



(t;j,^,fc)6A 



where 



&(r) := / 2^- sinc(2^a)<, s ,o(T- (a,0))^. )S , i0 (r)e 2 - t ^^a- 
Using integration by parts, we obtain 

|/ ^(r)e- 2 ^W| < c L)A f<|ti|)- L <|t2|)- M ||I> L '^||ooSupp(^) 



where 

|Z^I < 2hj f |sinc(2/ lj ei)|| J D L ' M (^ iSi0 (r-(ei,0))^ iS , (r))|da 

< 2h j \\ sinc(2/ l ,-)|| 2 || J D L ' M (^, Si0 (r - (, 0))o* >y> (r))|| 2 

< c I)tf 2fcf ||^ M « 3 , s ,o(r- (,0))^. )S , )0 (r))|| oo a- 1 . 
Since



d N 

d N 



(K,s,o°a, s >,o) = 0(a/'af) and 



Consequently, as j — > oo, 



T 2 /i 



1/2 



•W„-4„-l„3/2 2iV 

'j "i 



(L;j,£,k)eAj 



< a ■ /ij — > 0. 



By construction, $T_3 \le 2^{-1/2}(T_1 + T_2)$. □

Notice that - in contrast to the wavelet result - here we require the stronger condition $2^j h_j \to 0$ as $j \to \infty$ to handle the additional angular component.

We now apply Proposition 2.7 to Lemma 3.1, Lemma 5.1, and Theorem 5.3 to obtain the desired convergence for the normalized $\ell_2$ error of the reconstruction $L_j$ from (Inp). In this case $x^0 = w\mathcal{L}_j$ and $\Phi$ are shearlets $\sigma_{j,\ell,k}$ at scale $j$.

Theorem 5.4. For $h_j = o(2^{-j})$ and $L_j$ the solution to (Inp) with $\Phi$ the shearlet system defined using the Meyer wavelet,

$$\frac{\|L_j - w\mathcal{L}_j\|_2}{\|w\mathcal{L}_j\|_2} \to 0, \quad j \to \infty.$$

This result shows that we have asymptotically perfect inpainting as long as the size of the gap shrinks faster than $2^{-j}$. The similar result for wavelet inpainting, Theorem 4.3, only guarantees such successful inpainting when the gap is asymptotically smaller than $2^{-2j}$.
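To make the comparison between Theorems 4.3 and 5.4 concrete, the reference scales $2^{-2j}$ (wavelets) and $2^{-j}$ (shearlets) can be tabulated. This is plain arithmetic on the stated rates, not a simulation of either method; the theorems require $h_j$ to be asymptotically smaller than these scales, and the function name is ours.

```python
def reference_gap_scale(j, system):
    """Scale the gap size h_j is compared against at level j.

    'wavelet'  -> 2^(-2j), as in Theorem 4.3
    'shearlet' -> 2^(-j),  as in Theorem 5.4
    """
    exponent = {"wavelet": -2 * j, "shearlet": -j}[system]
    return 2.0 ** exponent


if __name__ == "__main__":
    for j in range(1, 7):
        wav = reference_gap_scale(j, "wavelet")
        she = reference_gap_scale(j, "shearlet")
        # The ratio she / wav equals 2^j, so the admissible gap for
        # shearlets is larger by a factor that grows with the scale.
        print(j, wav, she, she / wav)
```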

5.2 Thresholding

Our first claim concerns the set of the thresholding coefficients $T_j := \{\eta = (\iota; j, \ell, k) : |\langle w\mathcal{L}_j, \sigma_\eta \rangle| \ge \beta_j\}$ for some $\beta_j > 0$.

Lemma 5.5. For $h_j = o(2^{-j})$ as $j \to \infty$, there exist thresholds $\{\beta_j\}_j$ such that, for all $j \ge j_0$,

$$\{(\iota; j, \ell, k) : |k_1| \le \rho\,2^{2j(1+\nu_1)},\ |k_2| \le 2^{2j\nu_2},\ \ell = 0;\ \iota = v\} \subseteq T_j$$

for some $j_0$, $\nu_1$, and $\nu_2 < 1/4$.

Proof. We first observe that

|((1 -M h] )w£j,a] Ak )\ = \(5o-kwCj,d- v jAk ) - (M hj -kwCj,a v jAk )\. 
The first term equals 



(5 *w2j,a] Ak ) = 2 3 / 2 



w(Ci)F(^/2 23 , 6)W(6/2 2J , ^)V(e+2-^ 1 /^ 2 )e- 2 ^ b ^dCi 



(14) 
whereas, by using Lemma 3.2, we derive for the second term



(M hj wCj,al £7k ) = 2hj J sinc(2^Ti) J ^(6)^(6, 6)^, fe (Ci + n,6Kdn 



21/2 S J^ 2hj J smc ( 2 ^ 7 "i)i r (a/2 2i ,6)x 
x W (6 /2 2 ^' , £ 2 ) V (£ + 2- J T A±^L ) 6-2^(61 ,ri+a> dTi ^ 

42 



= : 2^y G(6)e- 2 ^< 22jfe ^^e2. 

By standard arguments, we can deduce that 

\((l-M hj )wC 3 ,a] Ak )\ < c Nl 23^\\G\U\2 2 3 b2 \r N \ 
By b 2 = k 2 /2 2 i due to b = (A%_ j SF e ) T k, we have 

|<(l-A< fc >/**M*>l< ^^HGHoodfcal)-^ (15) 
Let us now investigate the term Halloo further. We define 

= F^ 1 /2 2 3,^)W(Ci/2 2 3,^)V(e + 2-^ 1 /C 2 )-2h j j faw{2h j Ti)F(t2/2 2i ,h)W(Zi/2V,£ 



2 X 



xF(l + 2" 



; £l + Tl , 
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-27ri(6i ,n) 



and hence need to analyze 



\\G\V 



(16) 



By Plancherel's theorem and the support properties of w, 



bi+p 



Ht 2 {x)dx 



We now need to compute H . Using well-known properties of the Fourier transform, we manipulate 
H^ 2 to obtain 

H i2 (x) = (F(-/2 2 3,&W(-/2 2 3,Z 2 )V(l + 2-3(./a 2 ))) V (x) + 

-((2/ lj sinc(2/ l ,0e- 2 ^>(F(V2 2 ^42)W(-/2 2i ,42)U(£ + 2^(./4 2 ))) V (-x) 

= (F(./2 2 ^6)W(V2 2 ^,42)U(£ + 2^(./4 2 ))) V (x) + 

- (2h 3 smc(2h r )e- 2 ^y (-x) (F(-/2 2 3, &)W(-/2 2 3, &)V{1 + 2~i • /4 2 )) V (-x) 

= (F(-/2 2 3\&W(-/2 2 3,t 2 )V(l + 2-3 {■/&))) V (x) + 

-l[- hj , hj] (x - h) {F(./2 2 3,a 2 )W(-/2 2 3,Z 2 )V(t + 2-3 ■ /4 2 )) V (-x). 
Hence, since $h_j < \rho$,






p-h+p 






/ H^ 2 (x)dx 






J-H-p 





bl+P (F(./2^ 6)W(72 2 ^, &)V(£ + 2^' (•/&))) V (x) 
5l+ " 3 (F( -/2 2j , 6)W(-/2 2J , £ 2 )V(£ + 2-i (-/6))) V (x)dx 

61 — hj 



+ 

Z>(bi-p) 725(61+^) 



F(72^,6)W(V2 2i ,6) x 

xy(^ + 2^(7^))J (ar)dx . 

Notice that this indeed makes sense, since the values $k_1$ "in between $h_j$ and $\rho$" should play an
essential role. As already observed in the proof of (15), we have $b_1 \approx k_1/2^j$ for $j$ large and small
$|\ell k_2|$ (since $b_1 = 2^{-j}k_1 + 2^{-2j}\ell k_2$), and hence
-h+p 

H(x)dx 



bi—p 



+ 



F{-/2 2 ^ 6)W(-/2 2 ^, &)V(l + 2~ J ' (•/&)) J (x)dx 

>ki-2ip Jk 1 +2Jh 

Notice that this fact also implies that the function 

(F(-/2 2 ^, 6)W(-/2 2i , S 2 )V{1 + 2-''(./6))) V 
is independent of $j$. Due to the regularity of $W$, there exist some $N_2$ and $c$ such that

I (F(-/2 2 ^6)W(-/2^,6)^(^ + 2- i (-/6))) V (^)| < c(|x|)-^, 
and hence by (16) and the previous computation,

IIGIU < cimmilki-VpHla + Vpl})-"'. 
Finally, we study how the term $H$ relates to $h_j$. For this, we set

4fa) = ^(6/2 2j , 6)w(a/2 2i , 6)^+2 



(17) 



,-j6_+^l\ e -27ri(6i,n> 



6 



Now, 
l#& (6) 



F(£i/2 2i , 6)W(6/2 2i , 6)^ + 2- J 'ei/6) - 2hj J sinc(2^ri)F(6/2 2i , 6)W(6/2 2i , 6 

a + n )e _ 2m(6l , T1>(iTi 



2) x 



6 



xy(£ + 2- J - 

|J 6 (0)-2/i j / smc(2/t J ri)J 6 (ri)dri| 

l4(°)- / i[-/ 1J) /i J ]fa)4fa) dr il 



l4(o) 



J^ 2 (x)da 



\x\>hj 



J^ 2 {x)dx\. 
Hence another way to estimate (16) is by

Halloo — c| -ZTI qo 



< cmax 

6,6 



< cmax 

6,6 



\x\>h 



(F{t l /2 2 \t 2 )W(t 1 /2 2 >^ 2 )V(t + 2-i^)y(x-b 1 )dx 

6 



\x\>2ihj 



m 1 /2 2 \&)W(Z 1 /2 2 \Z 2 )V(£+ ' + l ^ 1 )) v (x-2^b 1 )dx 

6 



Certainly, the minimum is attained in the center of the mask, i.e., with $b_1 = 0$. So by combining
this with (15) and (17),



\{{l-M hj )wC v a% k )\ < c2i 



max 

\x\>2 2 ihj 6,6 



\x\>23hj 



(F(6/2 2 ^',6) 



xW(gi/2 2 ',&)y(l + ' + l 3il )Y{x-2 2 ib 1 )dx 

6 



;(mm{\k 1 -2 2 ip\,\k 1 + 2 2 ip\})- N H\k 2 \y 



■Ni 



which is what we intend to use as a "model." Observe that this is indeed the right intuitive estimate,
since the $k_2$ component has to decay rapidly away from zero, thereby sensing the singularity at zero
in this direction. In contrast, the $k_1$ component stays greater than or equal to $(2^{2j}\rho)^{-N_2}$ up to the point
$2\rho 2^{2j}$ and then decays rapidly, in accordance with the fact that until the point $k_1 = \rho 2^{2j}$ we are
"on" the line singularity, which decays smoothly with $w$. Also, the required angle sensitivity is
represented. Finally, the first term models the behavior in the mask, which is also nicely supported
by the fact that the crucial product $2^{2j}h_j$ appears therein. Set

J(-) = F(./2^,6)M-/2 2J , 6)^(^ + 2^(76)). 

Since $2^j h_j \to 0$ as $j \to \infty$, letting $j \to \infty$ we have



/ J(x)dx 

J\x\>2ihj 



< c. 



We now use 

P = c2^ 2 (C - e)(|2^|)- iVl (min{|(2^ - l)Vp\, \(2 je + 1)2^ p\}y N2 
as a threshold. It follows immediately that, for all j > jo, 

{(i-j,£,k):\ki\ < p2 2 ^ 1+ ^\ \k 2 \< 2 2 ^, £ = 0;t = v}QT j 



for some $j_0$ and $\nu_1$. □

Lemma 5.6.
\[ \sum_{\eta \in \mathcal{T}_j^c} |\langle w\mathcal{L}_j, \sigma_\eta\rangle| = o(2^j), \quad j \to \infty. \]

Proof. We observe from the proof of Lemma 5.1 that the desired property is automatically satisfied



provided that, for all j > jo, the set Tj contains 

{(i;j,£, k) : |fci| < p2 2 ^ 2 +^\ \k 2 \ < 2 2 ^\ i = 0, i = v}, 



for some v\ > 0, which is the content of Lemma 4.4 



□ 



We next analyze the second term in the estimate from Proposition 2.9.

Lemma 5.7. For $h_j = o(2^{-j})$ as $j \to \infty$,
\[ \sum_{\eta \in \mathcal{T}_j} |\langle M_{h_j} w\mathcal{L}_j, \sigma_\eta\rangle| = o(2^j), \quad j \to \infty. \]

Proof. First, we need to derive some estimates dependent on $(k,\ell)$ for the term $|\langle M_{h_j} w\mathcal{L}_j, \sigma^{\iota}_{j,\ell,k}\rangle|$.
By using the definitions of $M_{h_j}$ and $w\mathcal{L}_j$ and a change of variables, we obtain



(MhiW£j,o-j 



j,i,k/ 



2 J/2 



wfa^hj J smc(2/ Ji r 1 ) J P(ei/2 2 ^6) 



:W(£ 1 /2 2 i,&)V(l + 2-^'^±il) e - 2 ^i(^+a) dn ^ 1 

6 



-27ri(2 2 Jfe 2 ,?2> 



Let G now be the function 

£(6) = J tifo^hj j smc(2fyTi)i^/2 2 ^6)W(£^ 

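This function is supported on the set $[1/16, 1/2]$, which is independent of $j$. By standard arguments,
we can deduce that
\[ |\langle M_{h_j} w\mathcal{L}_j, \sigma^{\iota}_{j,\ell,k}\rangle| \le c\,2^{j/2}\,\|G\|_\infty\,\langle|k_2|\rangle^{-N_1}. \tag{18} \]
Let us now investigate the term $\|G\|_\infty$ further. We define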

%(6) = s inc(2/ lj r 1 )F(6/2 2 ^6)W(ei/2 2 ^6)^(£ + 2^^^)e- 2 ^<'' 1 ^ + «^ ( iT 1 , 

and hence need to analyze 



IGII 



-2^(61^1)^ 



(19) 



By Plancherel's theorem and the support properties of w, 



-27Tt<6l,Ci) 



|(t&^ 3 ) v (-6i)|«c 



-&i+p 



H^ 2 (x)dx 



Next, 



= (^(2^ sinc(2/ lj -)e- 2m,,1 -)*(i ? ( 2 2j, 6)^(^,6)^(^ + 2^(76))); (-s) 



2/ij sinc(2^.)e- 2 ^j (-*) (F(./2 2 ^, £ 2 )W(-/2 2j ', 6)^ + 2~' (-/6))) (-*) 

i[-^^](-x-6i)(F(72 2 ^6)>v(-/2 2j ',6)^ + 2- J (76))) v M). 
Hence, since $h_j < \rho$,
—fei+p 



b— i-p 



H^, 2 (x)dx 



bi+hj 

{F(-/2* ,&)W(-/2 2j \t2)V(e + 2-3{./Z 2 ))y(-x)dx 

b\—hj 



{F(./2i,&)W(-/2i,t; 2 )V(e+(-/b))) V (-x)dx 



Notice that this indeed makes sense, since due to the masking, the length of the line singularity is
not allowed to play a role here. Since $(k,\ell) \in \mathcal{T}_j$, we have



-&i+p 



—h—p 



H{x)dx 



(F(-/2>, &)W(-/2', &)V{1 + (./e 2 ))) V (-x)dx 



Due to the regularity of $W$, there exist some $N_2$ and $c$ (possibly differing from the ones before, but
we do not need to distinguish those) such that

|(F(V2^6)W(-/2^6)^ + (-/6))) v (-x)| <c(\x\)~ N \ 



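and hence by (19) and the previous computation,
\[ \|G\|_\infty \le c\,\langle \min\{|k_1 - 2^j h_j|, |k_1 + 2^j h_j|\}\rangle^{-N_2}. \]

Combining this estimate with (18), we obtain
\[ |\langle M_{h_j} w\mathcal{L}_j, \sigma^{\iota}_{j,\ell,k}\rangle| \le c\,2^{j/2}\,\langle|k_2|\rangle^{-N_1}\,\langle \min\{|k_1 - 2^j h_j|, |k_1 + 2^j h_j|\}\rangle^{-N_2}, \]

which is what we intend to use.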
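Hence,
\[ \sum_{\eta \in \mathcal{T}_j} |\langle M_{h_j} w\mathcal{L}_j, \sigma_\eta\rangle| \le c\,2^{j/2} \sum_{\eta \in \mathcal{T}_j} \langle|k_2|\rangle^{-N_1}\,\langle \min\{|k_1 - 2^j h_j|, |k_1 + 2^j h_j|\}\rangle^{-N_2} \le c\,2^{2j(1/4+\nu_2)}. \]

Since $\nu_2 < 1/4$, we have $2^{2j(1/4+\nu_2)} = o(2^j)$, and the lemma is proven.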



□ 
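The Fourier-side computations in the two proofs above use the kernel $2h_j\,\mathrm{sinc}(2h_j\,\cdot)$ as the frequency-domain counterpart of the indicator function of $[-h_j, h_j]$. The following is a minimal numerical sanity check of that correspondence, not part of the original argument. It assumes the normalized convention $\mathrm{sinc}(u) = \sin(\pi u)/(\pi u)$ and the Fourier kernel $e^{-2\pi i \tau x}$; both conventions, as well as the function names and the sample value of $h$, are assumptions made for illustration.

```python
import cmath
import math


def sinc(u):
    """Normalized sinc, sin(pi u) / (pi u), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)


def indicator_transform(tau, h, steps=20_000):
    """Midpoint-rule approximation of int_{-h}^{h} exp(-2 pi i tau x) dx."""
    dx = 2.0 * h / steps
    total = 0j
    for i in range(steps):
        x = -h + (i + 0.5) * dx
        total += cmath.exp(-2j * math.pi * tau * x)
    return dx * total


h = 0.125  # hypothetical half-width of the mask
for tau in (0.0, 0.3, 2.5):
    numeric = indicator_transform(tau, h)
    closed_form = 2.0 * h * sinc(2.0 * h * tau)
    print(tau, abs(numeric - closed_form))
```

A shifted indicator $1_{[-h,h]}(\cdot - b_1)$ only adds the modulation $e^{-2\pi i \tau b_1}$ to the transform, which is the exponential factor that accompanies the sinc kernel in the proofs.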



We now apply Proposition 2.9 to Lemmata 3.1, 5.6, and 5.7 to obtain the desired convergence
for the normalized $\ell_2$ error of the reconstruction $L_j$ from One-Step-Thresholding in Figure 6.
In this case $x^0 = w\mathcal{L}_j$ and $\Phi$ are the shearlets $\sigma^{\iota}_{j,\ell,k}$ at scale $j$.



Theorem 5.8. For $h_j = o(2^{-j})$ and $L_j$ the solution to One-Step-Thresholding (Figure 6) with $\Phi$ the shearlet system defined using
the Meyer wavelet,
\[ \frac{\|L_j - w\mathcal{L}_j\|_2}{\|w\mathcal{L}_j\|_2} \to 0, \quad j \to \infty. \]

This result shows that if the size of the gap shrinks faster than $2^{-j}$, the gap can be asymptotically
perfectly inpainted.
6 A Comparison of Shearlets vs. Wavelets

From the results of previous sections, we see that the size of the gaps which can be filled by
shearlets ($h_j = o(2^{-j})$) with asymptotically high precision is larger than the corresponding size for
wavelets ($h_j = o(2^{-2j})$). However, we still need to prove that we cannot do better than the
presented rates for wavelets in order to show that shearlets perform better than wavelets. In fact,
we show that the rates presented for wavelets are indeed the "critical scales" for the thresholding
case.
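To make the two rates concrete, note that $h_j = o(2^{-j})$ is the same as $2^j h_j \to 0$, while $h_j = o(2^{-2j})$ is the same as $2^{2j} h_j \to 0$. The short sketch below tabulates both products for a hypothetical gap sequence $h_j = 2^{-3j/2}$, which lies strictly between the two rates. The choice of this sequence and the function name are illustrative assumptions, not taken from the analysis.

```python
def scale_products(j, decay_exponent):
    """Return (2^j * h_j, 2^(2j) * h_j) for the hypothetical gap h_j = 2^(-decay_exponent * j)."""
    h_j = 2.0 ** (-decay_exponent * j)
    return (2.0 ** j) * h_j, (2.0 ** (2 * j)) * h_j


# With decay exponent 3/2 the first product shrinks while the second grows,
# so the gap meets the shearlet rate but not the wavelet rate.
for j in (2, 4, 8, 16):
    shearlet_product, wavelet_product = scale_products(j, 1.5)
    print(j, shearlet_product, wavelet_product)
```

For such a sequence the shearlet condition holds, whereas the quantity $2^{2j}h_j$ that governs the wavelet coefficients inside the mask is unbounded.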

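Theorem 6.1. Let $\psi_\lambda$ be the Meyer Parseval wavelets. Let $T$ be an index set such that
\[ T \supseteq \{(\iota, j, 0, (k_1, 0)) : |k_1| \le 2^{2j}h_j - K_0\} \]
for some $K_0 > 0$ and $h_j > 0$. Then, we have
\[ \sum_{\lambda \in T} |\langle M_{h_j} w\mathcal{L}_j, \psi_\lambda\rangle| = O(2^{2j}h_j). \]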

Proof. Recall that at level j, the signal wC is filtered with the three corresponding frequency strips: 

p i = E {W\2- 2 ^) + W\2-^- 1 i)) 

ie{h,v,d} 

with 

Fj= W\2-^) 

idh,v,d 

so that 

Fj = F 2j + F 2j+1 . 

We can consider each of the filtered signals; i.e., consider wCj := wC-kFj with i = v,h,d. Since 
the signal is a horizontal line segment, we only need to consider wCj. For simplicity, we denote 

wCj := Fj := Fj 1 , and = =■ 4>j,k- Note that F/(x,y) = 2 2 ' </>(%> x)W (2? y). We want 

to estimate the coefficients \(M-h 3 wCj,^\}\. As with other proofs for wavelets, we first consider 
wCj. By definition, we have 



{M hj w£j,ip x ) = / wCj(x,y)ip x (x,y)dydx 

J\x\<hj JyeB, 



\x\<hj JyeB, 



(wC * F v )(x, y)ipx(x, y)dydx 



'\x\<hj JyeB, J zeB? 

Now, by the definition of $w\mathcal{L}$, we have



= / / / w£(zi,z 2 )Fy((x,y) - (z 1 ,z 2 ))dzil>x(x,y)dydx. 

J\x\<h4 JyeB, J zeB? 



{M hj wCj,yjx) = j j j w(z)F^(x - z,y)dzyj x (x,y)dydx 

J\x\<hj JyeB J-p 

~ cf [ [ Fj(x - z,y)dz^x(x,y)dydx 

J\x\<hi JveB J-o 
f-p

2 2] <f){2 ] {x - z))W(2 ] y)dz2 J (j)(2 : >x - ki)W(2?y - k 2 )dydx 

'\x\<hj J ye~R J —p 

c2 j [ W(2 j y)W(2 j y - k 2 )dy2 2j I I" (j)(2 j (x - z))dz0(2 j x - k x )dx 
JyeR J\x\<hj J-p 

c2 2j f ^ (f)(2 j (x - z))dz(j)(2 j x - k x )dx 

J \x\ <hj J —p 
n r2'p+2^x 

c2 j / / (f>(z)dz<j>(2 j x - h)dx 

J\x\<hj J-23p+2ix 
~-k 1 +23h j ^p+x+k-i 

4>(z)dzcj)(x)dx. 

-2ip+x+ki 

For each $x \in [-k_1 - 2^j h_j, -k_1 + 2^j h_j]$, we have $x + k_1 \in [-2^j h_j, 2^j h_j]$. Consequently, we have
\[ [-2^j\rho + x + k_1,\; 2^j\rho + x + k_1] \supseteq [-2^j(\rho - h_j),\; 2^j(\rho - h_j)] \]
for all $x \in [-k_1 - 2^j h_j, -k_1 + 2^j h_j]$. Note that $\rho > h_j$. Hence, when $j$ is large enough, we have
$\int_{-2^j\rho + x + k_1}^{2^j\rho + x + k_1} \phi(z)\,dz \approx c \neq 0$ due to $\int \phi(x)\,dx \neq 0$. Therefore, we have

(MhiWJOj^x) « c / +/ )(j)(x)dx. 

yj-k^^hj j-k 1 -2^+ i h j j 

As $\int \phi(x)\,dx \neq 0$, there exists $K_0 > 0$ such that
\[ \int_{|x| \le K} \phi(x)\,dx \ge C_0 \]
for some $C_0 > 0$ as long as $K \ge K_0$. Hence, when $j$ is large enough so that $2^{2j}h_j > K_0$ and
$k_1 \in [-(2^{2j}h_j - K_0), 2^{2j}h_j - K_0]$, we have about $2^{2j}h_j - K_0$ many coefficients that are larger than
$C_0$. Consequently, when $j$ is large enough, we have
\[ \sum_{\lambda \in T} |\langle M_{h_j} w\mathcal{L}_j, \psi_\lambda\rangle| = O(2^{2j}h_j) \]
as long as the index set $T \supseteq \{(\iota, j, 0, (k_1, 0)) : |k_1| \le 2^{2j}h_j - K_0\}$.

For the other orientations wC"j and wCj, the coefficients are negligible following calculations 
similar to above. □ 



In the proof of Proposition 2.10, we have
\[ \|x^\star - x^0\|_2 = \|\Phi 1_{T^c} \Phi^* P_K x^0 + \Phi 1_T \Phi^* P_M x^0\|_2 =: \|T_1 + T_2\|_2 \ge \|T_2\|_2 - \|T_1\|_2. \]
In the wavelet thresholding case, the first term corresponds to $T_1 = \sum_{\lambda \in T^c} |\langle w\mathcal{L}_j, \psi_\lambda\rangle|$, while the
second term corresponds to $T_2 = \sum_{\lambda \in T} |\langle M_{h_j} w\mathcal{L}_j, \psi_\lambda\rangle|$ for some index set $T$. As shown in
the analysis of wavelet thresholding, to guarantee that the first term $\|T_1\|_2$ is small, the index set $T$ is chosen
such that $T \supseteq \{(k_1, k_2) : |k_1| \le \rho 2^{2j(1+\nu_1)},\ |k_2| \le 2^{2j\nu_2}\}$. But then the second term $\|T_2\|_2$ will
be of order $O(2^{2j}h_j)$ as shown above. If $h_j$ decays slower than order $O(2^{-j})$, then we have
$\|L_j - w\mathcal{L}_j\| = O(2^j)$. Thus, we have the following theorem:
Theorem 6.2. For $h_j = \omega(2^{-j})$ and $L_j$ the solution to One-Step-Thresholding (Figure 6), where $\Phi$ is the 2D Meyer Parseval
system,
\[ \frac{\|L_j - w\mathcal{L}_j\|_2}{\|w\mathcal{L}_j\|_2} \not\to 0, \quad j \to \infty. \]



That is, the wavelet threshold method does not fill the gap. Heuristically, one can think about
the situation when the gap size $h_j$ is fixed as 1. Consider the wavelets $2^j\phi(2^j x - k_1)W(2^j y)$. Then
as $j \to \infty$, the number of such wavelets that fall in the gap is about $O(2^{2j})$. The coefficient $\langle M_{h_j} w\mathcal{L}, \psi_\lambda\rangle$
for any such wavelet in the gap is about the same size. Consequently, the total energy concentrated in
the gap will be about $O(2^{2j})$.



When $2^{2j}h_j \to 0$ and since $|\phi(x)| \le c_N \langle|x|\rangle^{-N}$ for any $N$, we have



\{M hj w£j,il>x)\ < c2 2 ih J (mm{\k 1 



2^1,1^+2^1}) 



-N 



For the Meyer mother wavelets $W^v = W(x)\phi(y)$ and $W^d = W(x)\psi(y)$, the above inequality still
holds. In this case, the threshold method fills the gap.

Contrasting Theorem 5.8 and Theorem 6.2, we see that when the gap size $h_j$ decays like $2^{-j}$,
using the One-Step-Thresholding algorithm produces a good approximation of the original
image if shearlets are used but does not if wavelets are used.
Figure 9 shows a comparison of wavelet- and shearlet-based inpainting results. In the left
column, a seismic image containing mainly curvilinear features is masked by 3 vertical bars. Using
2D Meyer tensor wavelets or shearlets (we refer to the ShearLab package at www.shearlab.org for
codes of shearlet transforms), the coefficients of the masked image are computed. After applying
the threshold and the backward transform, we derive a first approximation of an inpainted
image by leaving the known part unchanged. These steps are then iterated with the threshold
becoming smaller at each iteration. The outcome is illustrated in the middle column of Figure 9.
The last column is the zoom-in comparison. From this, we can also visually confirm that the shearlet
system is superior to the chosen wavelet system when inpainting images governed by curvilinear
structures such as the exemplary seismic image.
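The iteration described above (transform the current estimate, threshold, transform back, restore the known samples, lower the threshold) can be sketched in a few lines. The sketch below is only an illustration under strong simplifying assumptions: it works on a 1D signal and uses an orthonormal Haar transform as a stand-in for the Meyer wavelet and shearlet systems of the experiment, and it does not use ShearLab. All function names, the test signal, and the threshold schedule are hypothetical.

```python
import math


def haar_forward(x):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    coeffs = list(x)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        dif = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = avg + dif  # averages first, details second
        n = half
    return coeffs


def haar_inverse(c):
    """Inverse of haar_forward."""
    x = list(c)
    n = 1
    while n < len(x):
        merged = []
        for a, d in zip(x[:n], x[n:2 * n]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        x[:2 * n] = merged
        n *= 2
    return x


def inpaint(signal, known, thresholds):
    """Iterative hard thresholding; `known[i]` is False where data are missing."""
    estimate = [s if k else 0.0 for s, k in zip(signal, known)]
    for beta in thresholds:  # thresholds are expected to decrease
        coeffs = haar_forward(estimate)
        kept = [c if abs(c) >= beta else 0.0 for c in coeffs]
        approx = haar_inverse(kept)
        # Leave the known part unchanged; fill only the missing samples.
        estimate = [s if k else a for s, k, a in zip(signal, known, approx)]
    return estimate


signal = [1.0] * 8
known = [True, True, True, False, False, True, True, True]
print(inpaint(signal, known, [1.0, 0.5, 0.1]))
```

With a directional system in place of the Haar transform, the same loop structure would apply; only the forward and backward transforms change.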




Figure 9: Left column: original image and missing data. Middle column: wavelet inpainting and 
shearlet inpainting. Right column: wavelet zoom in and shearlet zoom in. 
7 Extensions and Future Directions



As mentioned previously, we believe that this work and [KKZ11] make important steps in a new
direction of theoretical analysis of inpainting problems. When taking into account the similar
results concerning geometric separation in [DK12] and [Kut12], clustered sparsity could provide a
new paradigm to prove theoretical results in a variety of problems involving sparsity. With this in
mind, we mention possible extensions of this work as well as current limitations.

• More General Singularity Models. We anticipate that our results can be generalized to a much
broader setting. In [DK12, Kut12], curvilinear singularities were segmented and flattened out
using the Tubular Neighborhood Theorem. This was done in such a way as to be able to apply
results concerning the clustering of curvelet coefficients along linear singularities to curvilinear
singularities. Using this technique, the results in this paper concerning line singularities $w\mathcal{L}$
should extend to curvilinear singularities.

• Different Masks. In this paper, we focus on a vertical strip as mask. However, after rotation 
other typical masks are locally vertical strips, and the analysis in our proofs occurred locally 
around the missing singularity. It is possible to think of a ball with radius h as mask, in 
which case similar results should be obtained. Other imaginable shapes could be horizontal 
strips, flat ellipsoids, and other polygonal objects. 

• Different Recovery Techniques. Both hard and soft iterative thresholding techniques are quite 
common and usually produce convincing results. The results in this paper concern one-step- 
(hard)-thresholding rather than iterative thresholding. As iterative thresholding is stronger 
than one-pass thresholding, we strongly believe that a similar abstract analysis can be derived 
leading to asymptotically precise inpainting results in this case. 

• Other Dictionaries. It should also be pointed out that the results in Section 2 hold for all
Parseval frames. Furthermore, the asymptotic analysis in Sections 4 and 5 holds not only for
the Meyer Parseval wavelets and shearlets, but also, for instance, for radial wavelets (or
any type of wavelet with isotropic features at each scale, similar to the radial wavelets) and
other directional multiscale representation systems such as curvelets. The necessary changes
in the proofs are foreseeable. Also, the novel framework of parabolic molecules advocated
in [GK12] could be applied. Furthermore, given the construction of 3-dimensional shearlets
in [GL11, KLL10, KLL12, KLLar], it seems likely that the proofs in Sections 4 and 5 will
generalize in a straightforward but technical manner to the 3-dimensional case.

• Noise. Data is typically affected by noise, a situation we considered in the abstract setting.
This analysis can also be applied directly to the wavelet and shearlet inpainting results,
leading to the same asymptotic behavior, provided that the noise $n$ is small compared to
the signal; i.e., the $\ell_1$ norm of $\Phi^* n$ is of smaller order than the $\ell_1$ norm of the filtered signal.
However, in the literature, noise is typically measured by the $\ell_2$ norm, not the $\ell_1$ norm.
8 Appendix: Decay of Shearlet Coefficients Related to Line Singularity

We present the idea of a continuous shearlet system in order to prove various auxiliary results. For
$\iota \in \{h, v\}$, $a > 0$, $s \in \mathbb{R}$, and $t \in \mathbb{R}^2$, define

K,sA') = a*/ 2 W{a 2 -)V\-A^_ s )e 2 ™^. 

It is easy to show that a b a s t = a~ z / 2 a ha,s {S'' s A i a _ 1 (• — t)) for some smooth function a L ' a ' s . For 
s = ±a, we similarly define the continuous version of the "seam" elements o~ at ± at t- The discrete 
shearlet system {o~j £k } is then obtained by sampling o- L a s t on the discrete set of points 

{i = h, v} x {a = 2~ j : j G N} x {s = I : £ G Z, |£| < 2 J '} x{ie A L 2 ^SL e Z 2 } 
U {i = |}x{a = r i :jeN}x{s = f:feZ, |£| = 2 j } x{t£ A 2 ^S L _ e Z 2 } 

To prove that the choice of Aj offers clustered sparsity for the shearlet frame, we need some auxiliary 
results. The following lemma gives the decay estimate of the shearlet elements. 
Note that if we define {\t\ a , S ;i) ■= {{S^A^tl), then 



KsAx)\ < c N a 3/2 {\x-t\ a ^) 



-N 



The following lemma is needed later for estimating the decay coefficients of the shearlet aligned 
with the singularity. 

Lemma 8.1. Let the line segment with respect to $(a,s,t;v)$ be $\mathrm{Seg}(a,s,t;v) := \{S^v_s A^v_{a^{-1}}(x - t_1, -t_2) : |x| \le \rho\}$. Then

1. Given the line
\[ \mathrm{Line}(a,s,t;v) := \{S^v_s A^v_{a^{-1}}(x - t_1, -t_2) : x \in \mathbb{R}\}, \]
the closest point $P_L$ to the origin on this line satisfies
\[ d_1^2 := \|P_L\|_2^2 = \frac{a^{-4}}{1+s^2}\,t_2^2. \]

2. Set $x_0 = \frac{s a^{-1}}{1+s^2}\,t_2 + t_1$. If $P_S$ is the closest point on the segment $\mathrm{Seg}(a,s,t;v)$ to the origin,
then
\[ d_2^2 := \|P_S - P_L\|_2^2 = \begin{cases} 0, & x_0 \in [-\rho, \rho], \\ \min_{\pm} a^{-2}(1+s^2)(\pm\rho - x_0)^2, & x_0 \notin [-\rho, \rho]. \end{cases} \]

Proof. Let $L(x) := S^v_s A^v_{a^{-1}}(x - t_1, -t_2)$. Then
\begin{align*}
\|L(x)\|_2^2 &= \|(a^{-1}(x - t_1),\; a^{-1}s(x - t_1) - a^{-2}t_2)\|_2^2 \\
&= a^{-2}(x - t_1)^2 + a^{-2}s^2(x - t_1)^2 + a^{-4}t_2^2 - 2a^{-3}s(x - t_1)t_2 \\
&= a^{-2}(1 + s^2)(x - t_1)^2 + a^{-4}t_2^2 - 2a^{-3}s(x - t_1)t_2.
\end{align*}
Solving $\frac{d}{dx}\|L(x)\|_2^2 = 2(x - t_1)a^{-2}(1 + s^2) - 2a^{-3}s t_2 = 0$, we have $x_0 = \frac{s a^{-1}}{1+s^2}\,t_2 + t_1$. It follows
that
\[ \|P_L\|_2^2 = \|L(x_0)\|_2^2 = \Big\|L\Big(\frac{s a^{-1}}{1+s^2}\,t_2 + t_1\Big)\Big\|_2^2 = \frac{a^{-4}}{1+s^2}\,t_2^2 =: d_1^2. \]

Note that $P_L \in \mathrm{Seg}(a,s,t;v)$ if and only if $x_0 \in [-\rho, \rho]$, in which case $d_2 = 0$. Otherwise,
\begin{align*}
d_2^2 &= \min_{\pm} \|L(\pm\rho) - P_L\|_2^2 \\
&= \min_{\pm} \|L(\pm\rho) - L(x_0)\|_2^2 \\
&= \min_{\pm} \|(a^{-1}(\pm\rho - x_0),\; a^{-1}s(\pm\rho - x_0))\|_2^2 \\
&= \min_{\pm} a^{-2}(1 + s^2)(\pm\rho - x_0)^2,
\end{align*}
which completes the proof. □
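As a sanity check of the closed forms in Lemma 8.1, the following sketch evaluates $x_0$, $d_1^2$, and $d_2^2$ and compares them with a direct computation on the parametrized line. It assumes the coordinate form $L(x) = (a^{-1}(x - t_1),\, a^{-1}s(x - t_1) - a^{-2}t_2)$ read off from the proof; the function names and the sample parameter values are hypothetical.

```python
def line_point(x, a, s, t1, t2):
    """L(x) = (a^{-1}(x - t1), a^{-1} s (x - t1) - a^{-2} t2), as assumed from the proof."""
    u = (x - t1) / a
    return (u, s * u - t2 / a ** 2)


def norm_sq(p):
    """Squared Euclidean norm of a point in the plane."""
    return p[0] ** 2 + p[1] ** 2


def closest_point_data(a, s, t1, t2, rho):
    """Closed forms of Lemma 8.1: returns (x0, d1^2, d2^2)."""
    x0 = (s / a) / (1.0 + s ** 2) * t2 + t1
    d1_sq = t2 ** 2 / (a ** 4 * (1.0 + s ** 2))
    if -rho <= x0 <= rho:
        d2_sq = 0.0
    else:
        d2_sq = min((sign * rho - x0) ** 2 for sign in (1.0, -1.0)) * (1.0 + s ** 2) / a ** 2
    return x0, d1_sq, d2_sq


# Hypothetical parameters: the minimizer lies inside [-rho, rho] in this case.
a, s, t1, t2, rho = 0.25, 0.5, 0.1, 0.3, 1.0
x0, d1_sq, d2_sq = closest_point_data(a, s, t1, t2, rho)
print(x0, d1_sq, norm_sq(line_point(x0, a, s, t1, t2)), d2_sq)
```

Moving $x$ slightly away from $x_0$ increases $\|L(x)\|_2^2$, which is consistent with $x_0$ being the minimizer of the quadratic in the proof.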
We need another auxiliary lemma. Note that 

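Lemma 8.2. Define $R_N(x_0, y_0) := \int_0^\infty \langle|(x_0, y_0 + \alpha)|\rangle^{-N}\,d\alpha$ (which may be thought of as a ray integral).
Then for $y_0 \ge 0$,
\[ R_N(x_0, y_0) \le \pi\,\langle|x_0|\rangle^{-1}\,\langle|(x_0, y_0)|\rangle^{2-N}. \]

Proof. Choose $\beta \in (0, 1)$. Then
\[ \int_0^\infty |f(\alpha)|\,d\alpha \le \Big(\sup_{t \in (0,\infty)} |f(t)|^{\beta}\Big) \int_0^\infty |f(\alpha)|^{1-\beta}\,d\alpha. \]
If we set $(1 - \beta)N = 2$ and $f(t) = \langle|(x_0, y_0 + t)|\rangle^{-N}$, then we obtain
\[ R_N(x_0, y_0) \le \Big(\sup_{v \in R(x_0, y_0)} \langle|v|\rangle^{2-N}\Big) \int_0^\infty \langle|(x_0, y_0 + \alpha)|\rangle^{-2}\,d\alpha. \]
Since
\[ \int_0^\infty \langle|(x_0, y_0 + \alpha)|\rangle^{-M}\,d\alpha \le \int_{-\infty}^{\infty} \langle|(x_0, \alpha)|\rangle^{-M}\,d\alpha = \langle|x_0|\rangle^{1-M} \int_{-\infty}^{\infty} (1 + \alpha^2)^{-M/2}\,d\alpha, \]
fixing $M = 2$ and recalling the classic identity $\pi = \int_{-\infty}^{\infty} (1 + \alpha^2)^{-1}\,d\alpha$ yield the bound
\[ \int_0^\infty \langle|(x_0, y_0 + \alpha)|\rangle^{-2}\,d\alpha \le \pi\,\langle|x_0|\rangle^{-1}. \]
Furthermore, since $y_0 \ge 0$,
\[ \sup_{v \in R(x_0, y_0)} \langle|v|\rangle^{2-N} = \langle|(x_0, y_0)|\rangle^{2-N}. \]
This completes the proof. □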
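The bound of Lemma 8.2 can be checked numerically. The sketch below assumes the bracket notation $\langle|v|\rangle := (1 + |v|^2)^{1/2}$ and approximates $R_N$ by a truncated midpoint rule; the function names, the truncation point, the step count, and the sample arguments are illustrative assumptions.

```python
import math


def bracket(*coords):
    """<|v|> := sqrt(1 + |v|^2) for v given by its coordinates (assumed notation)."""
    return math.sqrt(1.0 + sum(c * c for c in coords))


def ray_integral(x0, y0, N, upper=200.0, steps=20_000):
    """Midpoint-rule approximation of R_N(x0, y0), truncated to [0, upper]."""
    h = upper / steps
    return h * sum(bracket(x0, y0 + (i + 0.5) * h) ** (-N) for i in range(steps))


def lemma_bound(x0, y0, N):
    """Right-hand side of Lemma 8.2: pi * <|x0|>^{-1} * <|(x0, y0)|>^{2 - N}."""
    return math.pi / bracket(x0) * bracket(x0, y0) ** (2 - N)


for x0, y0, N in ((0.0, 0.0, 2), (1.0, 0.5, 3), (2.0, 3.0, 4)):
    print(x0, y0, N, ray_integral(x0, y0, N), lemma_bound(x0, y0, N))
```

Because the proof enlarges the half-line integral to an integral over the whole real line, the bound is not tight; the numerical values stay below it with a visible margin.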




Now we can estimate the decay of the shearlet coefficients aligned with the line singularity $w\mathcal{L}$
as follows.

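Lemma 8.3. Retaining the notation as above, we have
\[ |\langle w\mathcal{L}, \sigma^v_{a,s,t}\rangle| \le c_N \frac{a^{-1/2}}{\sqrt{1+s^2}}\,R_N\big(d_1, a^{-1}\sqrt{1+s^2}\,d_2\big) \le c_N \frac{a^{-1/2}}{\sqrt{1+s^2}}\,\langle|d_1|\rangle^{-1}\,\big\langle\big|(d_1, a^{-1}\sqrt{1+s^2}\,d_2)\big|\big\rangle^{2-N}. \]

Proof. We have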


rp 

(w£>i°ls,t)\ = I / wi{x)a"Ax,Q)dx\ 



-p 

< I W V a,s,t(x,0)\dx 
J-p 

< c N a- z ' 2 f (\w\)- N dw, (20) 

J Seg(a,s,t;v) 

where we use an affine transformation of variables to turn the anisotropic norm $|(x, 0)|_{a,s,t;v}$ into the
Euclidean norm $|w|$. Application of the same transformation to $[-\rho, \rho] \times \{0\}$ yields $\mathrm{Seg}(a,s,t;v)$.
The integral in (20) is along a curve traversing $\mathrm{Seg}(a,s,t;v)$ at speed $v_1 = a^{-1}\sqrt{1+s^2}$. If we let
$\mathrm{Ray}(a,s,t;v)$ denote the ray starting from $P_S$ and initially traversing $\mathrm{Seg}(a,s,t;v)$, then
3 / 2 f /U..|)"^ < a^' 2 I (\w\y N dw 



a * (w 



Seg(a,s,t;v) J Ray(a,s,t;v) 



< a^v- 1 / (\w\)- N dw 

J vi Ray(a,s,t;v) 
-1/2 /-oo 

< 4=^/ (Kdi.t)!)-^ 

VI + s z J Ul d 2 
a' 1 / 2 

< R N (di,h>id 2 ). 
Vl + s 2 



□ 



Next, we estimate the decay of the shearlet coefficients associated with those shearlets not 
aligned with the line singularity. 

Lemma 8.4. Let $t = (t_1, t_2)$. We consider the following three cases:
(i) $t_1$ and $t_2 \neq 0$. Then we have

\lo.,r \\ ^ „ i + \-L\. i- A'l -1/2 -ca~ 1 s2M 

\{ wL , a a,s,t)\ < CL,M\ti\ |*2 1 a 1 e a , 

when 1 < \s\ < a -1 

\l„.,r ~h \\ ^ „ U \-M -1/2 „-ca~ 2 „M 

\{wL,a aiStt )\ < CL,M\ti\ |*2 1 a > e a 

and for s = ±a _1 

\{wL,a a ,s,t)\ < cr M \ti\- L \t 2 \- M a- 1 / 2 e- ca - 1 a M . 
(ii) If exactly one of $t_1$ or $t_2$ is $0$, then we have

| (w£, < m > I < c L \t\ + tlr^a-^e-™- 1 *, t = h,v. 

(iii) $t_1 = t_2 = 0$. Then we have

\(w£,a< a ^ t )\<ca- 1 / 2 e- ca -\L = h,v. 
Proof. First, it is easy to show that 

° ° I ~v I ^ „ 3/2 L2M 

d £L d £M \ a a,s,o\ <c L , M a 1 a a , 
By definition of the line singularity wC, we have 

= /e- 2mt26 [|^(6K s ,o(ei, 



For $t_1$ and $t_2 \neq 0$, when we repeatedly apply integration by parts, we have



\(wC,a^ Stt )\ <C\t 2 \ M \h\ L ||/il,m||li(r), 



where 



and for some function $f$ which is sufficiently differentiable we define the multi-index derivative



D L ' M f( m , m ) 







drji J \ di] 2 







M 



The next step is to estimate the term |/il,m(^2)|- 
Let H ajS (^2) be the support of the function 



a^^' M (^(6)^, (a,6)). 

Note that for fixed a, s, the function £i i-> w(^i)a^ s (£i 5 £2) is supported inside [ca _1 |s|, ^a~ 1 s) for 
a constant c < \. Iil^m can then be written as 



h L M&)= [ £> i ' M (^(6)< s ,o(a,6))^i- 

We then rewrite the integrand as 



L 
Thus $|h_{L,M}(\xi_2)|$ is bounded by

\h L M&)\ < 

£=0 V J 

^ Ef^ll^OIUiica-iH,.- 1 !-!)^"^*) 

l— n V / 



w (6)^ M (< s , (6,6)Ri 



S(a, S )(6) 



£=0 



£=0 V 7 
<- „ c - ca ~ ls ^3/2 2M 

S cl,m& a a, 

where 

N L - e > M (a,s) = ||D L ^ M ^«i»6)ll^(B...(6)) 

Consequently, we have 

||^,m||li ( r) < c L ,„a- 2 e-^ ls a 3 / 2 a M 

Therefore, 

|(wL,CT a s t )| < c l ,m|ci| a 7 e a . 

Using the same approach, it is not difficult to show that for \s\ < a^ 1 , 

\(w£,v* Stt )\ < c LM \t 1 \- L \t 2 r M a' 1 / 2 e- ca - 2 a M , 

and for s = ±a~ 1 

L \^\- M n-^ P -™- x a M 



| {wC,a a , s ,t) | < cl,m |*i | 1*21 a 7 e 



The proofs for other cases are similar with simple modifications of the above procedure. □ 